Information field dynamics (IFD) is introduced here as a framework to derive numerical schemes
for the simulation of physical and other fields without assuming a particular sub-grid structure 
as many schemes do. IFD constructs an ensemble of non-parametric sub-grid field configurations 
from the combination of the data in computer memory, representing constraints on possible field 
configurations, and prior assumptions on the sub-grid field statistics. Each of these field configu- 
rations can formally be evolved to a later moment since any differential operator of the dynamics 
can act on fields living in continuous space. However, these virtually evolved fields need again a 
representation by data in computer memory. The maximum entropy principle of information theory 
guides the construction of updated datasets via entropic matching, optimally representing these 
field configurations at the later time. The field dynamics thereby become represented by a finite 
set of evolution equations for the data that can be solved numerically. The sub-grid dynamics is 
thereby treated within auxiliary analytic considerations. The resulting scheme acts solely on the 
data space. It should provide a more accurate description of the physical field dynamics than sim- 
ulation schemes constructed ad-hoc, due to the more rigorous accounting of sub-grid physics and 
the space discretization process. Assimilation of measurement data into an IFD simulation is con- 
ceptually straightforward since measurement and simulation data can just be merged. The IFD 
approach is illustrated using the example of a coarsely discretized representation of a thermally 
excited classical Klein-Gordon field. This should pave the way towards the construction of schemes 
for more complex systems like turbulent hydrodynamics. 
I. INTRODUCTION

A. Motivation 

Computer simulations of fields play a major role in 
science, engineering, economics, and many other areas
of modern life. Computer limitations require that
the infinite number of degrees of freedom of a field
is represented by a finite data set that fits into computer
memory. For example, in hydrodynamics with
mesh codes, the average density, pressure, and veloc- 
ities of the fluid within grid cells form the data. The 
data makes statements about the field properties, and 
the simulation scheme describes how the present data 
determines the future data. This dynamics is usually
set up such that the continuum limit of an infinite
number of infinitesimally dense grid points recovers
the partial differential equations governing the physi- 
cal field dynamics. However, there are many possible 
schemes to discretize the differential operators of the 
field equations. Which one gives good results already 
at finite resolution? Which one takes the influence 
of processes on sub-grid scales best into account? To 
address these questions, a rigorous approach to con- 
struct accurate simulation schemes, information field 
dynamics (IFD), is presented here. IFD rests on in- 
formation field theory (IFT), the theory of Bayesian 
inference on fields [1, 2]. In the ideal case, IFD and
IFT provide identical results, since both can be used 
to make statements about fields at later times given 
some initial data. However, in real world applications 
of simulation schemes, compromises with respect to 
accuracy and computational complexity are often
unavoidable. Thus IFD can be regarded as a particular
approximation scheme within IFT, which may or may 
not provide optimal results from an information the- 
oretical point of view. 

The basic idea is that IFT turns the data in com- 
puter memory into an ensemble of field configurations 
which are consistent with the data and the knowl- 
edge on the sub-grid physics and field statistics. The 
differential operators of the field dynamics can then 
formally operate on these field configurations without 
the usual discretization approximation. An unavoid- 
able approximation finally happens when these time 
evolved fields get recast into the finite data represen- 
tation in computer memory. The information theo- 
retical guideline of the Maximum Entropy Principle 
(MEP) is used in order to ensure maximal fidelity of 
this operation, which we call in the following entropic 
matching. The sub-grid dynamics is thereby treated 
within an auxiliary analytic consideration. In the end, 
an IFD simulation scheme for the time evolution of a 
field is a pure data updating operation in computer 
memory, and therefore an implementable algorithm. 
Although this algorithm no longer deals explicitly with
a field living in continuous space, it was
derived with the continuous space version of
the original problem very much present in the mathematical
reasoning. The sub-grid information, which
IFT used to construct the virtual continuous space 
field configurations, is encapsulated implicitly in the 
resulting IFD scheme. Therefore IFD schemes act 
solely on the data in computer memory without using 
any explicit sub-grid field representation. 

When constructing a computational simulation 
scheme for field dynamics, whether using IFD or not, 
one is facing two bottlenecks: finite computer mem- 
ory and finite computational time. This work deals 
only with the first issue, and explains how to construct
schemes which optimally use the data stored in
computer memory. Optimizing with respect to only 
one objective, memory in this case, very often results 
in solutions which are ineffective with respect to an- 
other aim, computational simplicity here. Thus we 
do not expect the resulting IFD schemes necessarily 
to be the optimal solution for a concrete computa- 
tional problem. Deriving practically usable schemes 
will often require additional approximations in order 
to reduce the computational complexity. The IFD 
framework can, however, help to clarify the nature 
of the approximations made and guide the design of 
simulation schemes. 

The concrete problem of how to discretize a ther- 
mally excited Klein-Gordon (KG) field in position 
space will illustrate the usage of the theoretical IFD 
framework. 



B. Previous work 

Our main motivation is to aid the construction of 
simulation schemes, for example in hydrodynamics, 
for which a very rich body of previous work exists. Appendix A
briefly discusses the relevant concepts of partial
differential equation discretization, sub-grid modeling,
and information theoretical concepts in simulation
schemes and their relation to IFD.



C. Structure of this work

In Sect. II we introduce the necessary concepts of
IFT, MEP, and IFD. In Sect. III IFD is developed in
detail on an abstract level, as well as for the illustrative
example of a KG field. The fidelity of IFD and a
typical ad-hoc scheme for the KG field are compared
numerically and against an exact solution in Sect. IV.
Section V contains our conclusion and outlook.



II. CONCEPTS 
A. Information field theory 

The idea of this work is that the data stored in a 
computer is only a constraint on possible field configurations,
but does not fully determine a unique
sub-grid field configuration. Instead, the ensemble of 
possible field configurations is constructed using IFT. 
IFT blends the information in the data and any prior 
knowledge on the field behavior into a single probabil- 
ity density function (PDF) over the space of all field 
configurations. 

IFT is information theory applied to fields, proba- 
bilistic reasoning for an infinite set of unknowns, the 
field values at all space positions. It provides field 
reconstructions from finite data. For this IFT needs 
data, a data model describing how the data are deter- 
mined by the field, and a prior PDF summarizing the 
statistical knowledge on the field degrees of freedom 
prior to the data. How this works in our case will be
shown in the following. A general introduction to IFT
can be found in [2] and in the references therein. 

IFT exploits mathematical methods from quantum
and statistical field theory. The unknown field $\phi$ is
regarded as a signal, a hidden message to be revealed
from the data $d$. A prior PDF $\mathcal{P}(\phi)$ describes the
knowledge about the signal field prior to the data,
and a likelihood PDF $\mathcal{P}(d|\phi)$ describes the probability
of the data given a specific signal field configuration.
Bayes' theorem allows one to construct the posterior
PDF

$$\mathcal{P}(\phi|d) = \frac{\mathcal{P}(d|\phi)\,\mathcal{P}(\phi)}{\mathcal{P}(d)}, \tag{1}$$

which summarizes the a posteriori (after the data is
taken) knowledge on the signal field. The connection
to statistical field theory becomes apparent when one
realizes that Bayes' theorem can also be written as

$$\mathcal{P}(\phi|d) = \frac{e^{-H(d,\phi)}}{Z(d)}, \tag{2}$$

with the information Hamiltonian

$$H(d,\phi) = -\log\mathcal{P}(d,\phi) = -\log\mathcal{P}(d|\phi) - \log\mathcal{P}(\phi), \tag{3}$$

and the partition function

$$Z(d) = \mathcal{P}(d) = \int\mathcal{D}\phi\,\mathcal{P}(d,\phi) = \int\mathcal{D}\phi\, e^{-H(d,\phi)}. \tag{4}$$

Here, $\int\mathcal{D}\phi$ denotes a phase space integral over all possible
field configurations of $\phi$, a so-called path integral.

The information Hamiltonian combines prior and 
likelihood into a signal energy, which determines the 
signal posterior according to the usual Boltzmann 
statistics. This Hamiltonian therefore contains all 
available information on the signal field. 

The simplest IFT case is that of a free theory. This 
emerges in case three conditions are met: 

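(i) The a priori distribution of the field is a
multivariate Gaussian,

$$\mathcal{P}(\phi) = \mathcal{G}(\phi, \Phi) \equiv \frac{1}{|2\pi\Phi|^{1/2}} \exp\left(-\frac{1}{2}\,\phi^\dagger\Phi^{-1}\phi\right), \tag{5}$$

with signal covariance $\Phi = \langle\phi\,\phi^\dagger\rangle_{(\phi)} = \int\mathcal{D}\phi\,\mathcal{P}(\phi)\,\phi\,\phi^\dagger$,
its determinant $|\Phi| = \det\Phi$, and $\phi^\dagger\psi = \int dx\,\overline{\phi_x}\,\psi_x$ denoting the
scalar product.

(ii) The data depends linearly on the signal
field,

$$d = R\,\phi + n, \tag{6}$$

with a known response operator $R$.

(iii) The noise $n = d - R\,\phi$ is signal-independent
with Gaussian statistics

$$\mathcal{P}(n|\phi) = \mathcal{G}(n, N), \tag{7}$$

where $N = \langle n\,n^\dagger\rangle_{(n|\phi)} = \int\mathcal{D}n\,\mathcal{P}(n|\phi)\,n\,n^\dagger$ is the noise covariance.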
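In this case, the likelihood $\mathcal{P}(d|\phi) = \mathcal{P}(n = d - R\phi|\phi) = \mathcal{G}(d - R\phi, N)$
and the prior $\mathcal{P}(\phi)$ contribute
terms to the Hamiltonian that are at most quadratic
in the signal. Thus, the Hamiltonian is also quadratic,
which is the mark of a free theory. In this specific
case, the information Hamiltonian states that the
posterior field is also Gaussian, but with shifted mean
$m = \langle\phi\rangle_{(\phi|d)} = \int\mathcal{D}\phi\,\phi\,\mathcal{P}(\phi|d)$ and uncertainty variance
$D$, which can be read off from

$$\begin{aligned}
H(d,\phi) &\mathrel{\widehat{=}} \frac{1}{2}\,(d - R\phi)^\dagger N^{-1} (d - R\phi) + \frac{1}{2}\,\phi^\dagger\Phi^{-1}\phi \\
&\mathrel{\widehat{=}} \frac{1}{2}\,\phi^\dagger \underbrace{(\Phi^{-1} + R^\dagger N^{-1} R)}_{D^{-1}}\,\phi - \frac{1}{2}\Big(\phi^\dagger \underbrace{R^\dagger N^{-1} d}_{j} + j^\dagger\phi\Big) \\
&\mathrel{\widehat{=}} \frac{1}{2}\,(\phi - m)^\dagger D^{-1} (\phi - m),
\end{aligned} \tag{8}$$

with

$$m = D\,j = \underbrace{(\Phi^{-1} + R^\dagger N^{-1} R)^{-1} R^\dagger N^{-1}}_{W}\, d = W\,d. \tag{9}$$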

Here and later "$\widehat{=}$" means equality up to irrelevant
constants.$^1$ In analogy to quantum field theory,
an information propagator $D = (\Phi^{-1} + R^\dagger N^{-1} R)^{-1}$
and an information source $j = R^\dagger N^{-1} d$ can be identified.
The information source $j$ is given by the data
$d$, weighted by the inverse noise covariance $N^{-1}$ and
back-projected with the Hermitian adjoint response $R^\dagger$
into the signal space. The a posteriori mean field $m_x$
at some location $x$ of the signal space is constructed by
transporting the information $j_y$ sourced by the data
at some location $y$ to $x$ with the help of the information
propagator $D_{xy}$. This happens by applying the propagator
as a linear operator to the information source field,
$m_x = \int dy\, D_{xy}\, j_y$. The resulting posterior mean field
depends linearly on the data, $m = W d$. The corresponding
linear filter operation $W$ is well known in
signal reconstruction as the (generalized) Wiener filter [3].
The information propagator $D$ is also identical
to the a posteriori uncertainty variance,

$$D = \langle(\phi - m)(\phi - m)^\dagger\rangle_{(\phi|d)}, \tag{10}$$

also known as the Wiener variance. To
conclude, in free IFT the posterior is Gaussian with
Wiener mean and variance,

$$\mathcal{P}(\phi|d) = \mathcal{G}(\phi - m, D). \tag{11}$$
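As an illustration of Eqs. (8)-(11), the following sketch evaluates the Wiener mean and variance for a small discretized toy problem and checks the completion of the square numerically. The pixel numbers, the exponential prior covariance, the pair-averaging response, and the noise level are assumptions made only for this example and are not taken from this work.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sig, n_dat = 8, 3                 # hypothetical sizes of signal and data space

# Assumed toy ingredients: prior covariance Phi, response R, noise covariance N
x = np.arange(n_sig)
Phi = np.exp(-np.abs(x[:, None] - x[None, :]) / 2.0)
R = np.zeros((n_dat, n_sig))
for i in range(n_dat):
    R[i, 2 * i:2 * i + 2] = 0.5     # each datum averages two neighbouring pixels
N = 0.1 * np.eye(n_dat)

def information_hamiltonian(d, phi):
    """H(d, phi) of Eq. (8), up to phi-independent constants."""
    r = d - R @ phi
    return 0.5 * r @ np.linalg.solve(N, r) + 0.5 * phi @ np.linalg.solve(Phi, phi)

def wiener_posterior(d):
    """Posterior mean m = D j and covariance D, Eqs. (9)-(11)."""
    D = np.linalg.inv(np.linalg.inv(Phi) + R.T @ np.linalg.solve(N, R))
    j = R.T @ np.linalg.solve(N, d)
    return D @ j, D

# Simulate a signal and noisy data, then reconstruct
phi_true = rng.multivariate_normal(np.zeros(n_sig), Phi)
d = R @ phi_true + rng.multivariate_normal(np.zeros(n_dat), N)
m, D = wiener_posterior(d)

# Completing the square: H(d, phi) - H(d, m) = (phi - m)^T D^{-1} (phi - m) / 2
phi_test = rng.normal(size=n_sig)
lhs = information_hamiltonian(d, phi_test) - information_hamiltonian(d, m)
rhs = 0.5 * (phi_test - m) @ np.linalg.solve(D, phi_test - m)
```

The agreement of `lhs` and `rhs` for an arbitrary test field reflects that the posterior mean minimizes the information Hamiltonian and that $D^{-1}$ is its curvature.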

Although the field mean $m$ is a continuous function
in the signal space, a full field with an apparently infinite
number of field values, it has, strictly speaking,
effectively only a finite number of degrees of freedom
due to its construction. Since the mean field is a deterministic
function of the data, $m = m(d) = W d$, the
phase space of possible mean fields can have at most as
many dimensions as the data has degrees of freedom.
This sets a limit to the maximal possible accuracy a
simulation scheme can achieve with a finite data representation
of the field. However, in this work, we do
not only evolve the mean field, but the full distribution
of plausible fields around it, as characterized by (11).
It should be noted that there exist two equivalent
formulations of the Wiener filter operator,

$$W = (\Phi^{-1} + R^\dagger N^{-1} R)^{-1} R^\dagger N^{-1} = \Phi R^\dagger (R\,\Phi R^\dagger + N)^{-1}. \tag{12}$$

The first one is called the signal space and the second
one the data space representation, since the operator
inversions happen in signal and data space, respectively.
They are fully equivalent as long as $\Phi$ and $N$
are regular matrices.$^2$

The data space representation of the Wiener filter,
$W = \Phi R^\dagger (R\,\Phi R^\dagger + N)^{-1}$, can cope with the
case of negligible noise relevant here, $N \rightarrow 0$, leading
to $W = \Phi R^\dagger (R\,\Phi R^\dagger)^{-1}$. This is possible only if
$\tilde{\Phi} = R\,\Phi R^\dagger$, the data space image of the signal field
covariance, is (pseudo-)invertible, which is very often
the case. If not, the data contains redundancies that
could be used to tailor the data space until $\tilde{\Phi}$ is invertible.
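The equivalence of the two representations in Eq. (12), and the behavior of the data space form in the noiseless limit, can be checked numerically. The sketch below uses an assumed toy setup (eight signal pixels, three data points, an exponential prior covariance); none of these choices come from this work.

```python
import numpy as np

n_sig, n_dat = 8, 3                      # hypothetical signal and data dimensions
x = np.arange(n_sig)
Phi = np.exp(-np.abs(x[:, None] - x[None, :]) / 2.0)   # assumed prior covariance
R = np.zeros((n_dat, n_sig))
for i in range(n_dat):
    R[i, 2 * i:2 * i + 2] = 0.5          # each datum averages two neighbouring pixels

def wiener_signal_space(N):
    """W = (Phi^-1 + R^T N^-1 R)^-1 R^T N^-1, inversions in signal space."""
    N_inv = np.linalg.inv(N)
    return np.linalg.inv(np.linalg.inv(Phi) + R.T @ N_inv @ R) @ R.T @ N_inv

def wiener_data_space(N):
    """W = Phi R^T (R Phi R^T + N)^-1, inversion in data space."""
    return Phi @ R.T @ np.linalg.inv(R @ Phi @ R.T + N)

N = 0.1 * np.eye(n_dat)
W_sig = wiener_signal_space(N)
W_dat = wiener_data_space(N)

# Noiseless limit: the data space form stays well defined for N = 0
W0 = wiener_data_space(np.zeros((n_dat, n_dat)))
d = np.array([1.0, -0.5, 2.0])
m0 = W0 @ d                               # mean field that reproduces the data exactly
```

In the noiseless case `R @ m0` returns the data `d`, since $R\,W = R\,\Phi R^\dagger (R\,\Phi R^\dagger)^{-1}$ is the identity in data space, whereas the signal space form cannot be evaluated because $N^{-1}$ does not exist.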

This noiseless limit might be a desirable assumption
for dealing with the data of a numerical simulation,
since one might define the data to represent a statement
about the field like $d = R\phi$ exactly, without any
uncertainty in data space. However, in the course of
a field dynamical simulation, the knowledge of the exact
field configuration $\phi$ might not be present at later
times due to unavoidable discretization errors. Therefore,
a mismatch of the data $d$ in computer memory
and the correct discretized statement $R\phi$ for the true
field might develop, and this can be regarded as noise
$n = d - R\phi$. Furthermore, a full error propagation
of initial value uncertainties in a simulation might be
of interest in case the initial data resulted from a real
measurement with instrumental noise. For these reasons,
we will keep the noise term in the formalism.

The Wiener filter theory described so far gives us a
sufficient IFT background for this initial work on IFD.
It should be noted, however, that in case of non-linear
relations between data and signal, or non-Gaussian
signal or noise statistics, IFT becomes an interacting
field theory, and the resulting operations on the data
to calculate a posteriori mean and variance become
nonlinear. Such operations can be constructed using
diagrammatic perturbation series, re-summation and
re-normalization techniques [US], or by the construction
and minimization of an effective action, the Gibbs
free energy [5, 6]. In many cases, the posterior is well
approximated by a multivariate Gaussian, which we
assume in the following.

1 This is of course a context dependent convention, since it
depends on what is regarded to be relevant. In the context
of this work, any field dependent quantity is relevant. Field
independent normalization constants of PDFs are not. The
sign "$\widehat{=}$" is here used as the logarithmic partner of the sign
"$\propto$", since normalization constants become constant additive
terms after taking the logarithms. Later on, we will also
regard terms of higher order in the time step $\delta t$ as irrelevant,
since they can be made to vanish by taking the limit $\delta t \rightarrow 0$.

2 The equivalence of the two Wiener filter representations is
easily verified via the following equivalence transformations:

$$\begin{aligned}
& (\Phi^{-1} + R^\dagger N^{-1} R)^{-1} R^\dagger N^{-1} = \Phi R^\dagger (R\,\Phi R^\dagger + N)^{-1} \\
\Leftrightarrow\; & R^\dagger N^{-1} R\,\Phi R^\dagger + R^\dagger = R^\dagger + R^\dagger N^{-1} R\,\Phi R^\dagger.
\end{aligned}$$



B. Entropic matching 

We assume now that an ensemble of field configurations
for a time $t$ has been constructed with IFT,
those being consistent with the data $d = d_t$ and any
background information at that time. It has to be
specified now how those evolve, and how this can be
represented by an updated dataset $d' = d_{t'}$ at a later
time $t'$.

Each of the possible field configurations is assumed 
to evolve for a short period according to the exact 
physical field dynamics. In order to recast this evolved 
ensemble of field configurations back into the data rep- 
resentation of the computational scheme, an updated 
data set has to be constructed. The field ensemble implied
by the updated data should resemble the evolved
field ensemble of the original data as closely as possible.
We will use entropic matching for this, i.e. the use of the
MEP without any additional constraints. The MEP is
the principle of our choice since it derives from very 
generic and desirable first principles on how to update 
a probability without introducing spurious knowledge. 

For the MEP, entropy is just regarded as an ab- 
stract quantity that can be used to rank various pos- 
sible PDFs according to how well they are suited to 
represent a knowledge state. A large entropy corresponds
to an uninformed or ignorance state. MEP aims
therefore for the least informed state that is still con- 
sistent with all known constraints. This should be the 
state with the least spurious assumptions. 

A number of intuitively obvious requirements on the
internal logic of such a ranking fully determines the
functional form of this entropy [7-10]. These requirements
are that local information should have only local
effects, that the ranking should be independent
of the coordinate system used, and that independent
systems lead to separable PDFs. These requirements
are further detailed in Appendix B. The only function
on the space of PDFs that is consistent with these
principles is the entropy

$$S(\mathcal{P}|\mathcal{Q}) = -\int\mathcal{D}\phi\,\mathcal{P}(\phi)\,\log\frac{\mathcal{P}(\phi)}{\mathcal{Q}(\phi)}, \tag{13}$$

where $\mathcal{P}(\phi)$ denotes a PDF for some field $\phi$ to be
ranked for its ignorance, and $\mathcal{Q}(\phi)$ an a priori ignorance
state. This entropy is the relative entropy of
information theory, the (negative) Kullback-Leibler divergence
of $\mathcal{P}$ to $\mathcal{Q}$ [10]. It is in general also equivalent (up to
some constant) to the Gibbs energy of thermodynamics
[5], and to the Boltzmann-Shannon entropy in case
the ignorance knowledge state $\mathcal{Q}$ does not favor any
region of physical phase-space, i.e. $\mathcal{Q}(\phi) = \mathrm{const}$.



Since the information entropy is equivalent to the
Kullback-Leibler distance of information theory, it can
also be used to match one PDF optimally to another
one. This entropic matching will be needed in this
work in order to find the data constrained representation
of the field PDF at a later instant that best
matches the time evolved PDF of an earlier instant.
In case $\mathcal{P}(\phi)$ can be changed at any phase-space point
$\phi$, maximizing $S(\mathcal{P}|\mathcal{Q})$ will reproduce the ignorance
prior, $\mathcal{P} \rightarrow \mathcal{Q}$. If there are, however, constraints limiting
the flexibility of $\mathcal{P}(\phi)$ to adapt to $\mathcal{Q}(\phi)$, the MEP
solution will be different. Such constraints can be imposed
with the help of Lagrange multipliers, or respectively
thermodynamical potentials, which can be used
to imprint certain expectation values onto $\mathcal{P}$, as
shown in Appendix B. In this work, constraints arise
due to the fact that the degrees of freedom to represent
functions and PDFs in computers are limited by
the size of the computer memory.
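For Gaussian knowledge states the entropy (13) has a closed form, which makes the effect of constraints easy to see in a finite-dimensional toy case. The following sketch is only an illustration under stated assumptions: the dimensions, the random target state, and the restriction of the mean of $\mathcal{P}$ to the span of a hypothetical basis `B` are invented for this example and do not represent the matching performed later in this work.

```python
import numpy as np

def relative_entropy(m_p, D_p, m_q, D_q):
    """S(P|Q) = -KL(P||Q) for Gaussians P = G(phi - m_p, D_p), Q = G(phi - m_q, D_q)."""
    n = len(m_p)
    dm = m_q - m_p
    Dq_inv = np.linalg.inv(D_q)
    _, logdet_p = np.linalg.slogdet(D_p)
    _, logdet_q = np.linalg.slogdet(D_q)
    kl = 0.5 * (np.trace(Dq_inv @ D_p) + dm @ Dq_inv @ dm - n + logdet_q - logdet_p)
    return -kl

rng = np.random.default_rng(1)
n, n_par = 6, 2
A = rng.normal(size=(n, n))
D_q = A @ A.T + n * np.eye(n)      # target covariance (assumed, positive definite)
m_q = rng.normal(size=n)           # target mean
B = rng.normal(size=(n, n_par))    # hypothetical basis: P may only have means m_p = B c

# Unconstrained P: the maximum S = 0 is reached at P = Q
s_free = relative_entropy(m_q, D_q, m_q, D_q)

# Constrained mean, covariance held at D_q: maximizing S over c is a
# generalized least-squares problem with weight D_q^{-1}
Dq_inv = np.linalg.inv(D_q)
c_best = np.linalg.solve(B.T @ Dq_inv @ B, B.T @ Dq_inv @ m_q)
s_best = relative_entropy(B @ c_best, D_q, m_q, D_q)
s_other = relative_entropy(B @ (c_best + 0.1), D_q, m_q, D_q)
```

With only `n_par` parameters available, the best achievable entropy `s_best` stays below zero, and any other parameter choice such as `c_best + 0.1` ranks lower still.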

To be concrete, we write $\phi' = \phi_{t'}$ and assume for
definiteness only that the short time step $\delta t = t' - t$
permits a deterministic and invertible functional relation
between $\phi'$ and the earlier $\phi = \phi_t$, so that
$\mathcal{P}(\phi'|\phi) = \delta(\phi' - \phi'(\phi))$ as well as $\mathcal{P}(\phi|\phi') = \delta(\phi - \phi(\phi'))$.$^3$

Here and later, we assume further that the target
knowledge state $\mathcal{Q}$ in our case is given by the Gaussian
signal field posterior $\mathcal{P}(\phi|d, t) = \mathcal{G}(\phi - m, D)$ at time
$t$ as specified by the data $d = d_t$ and the background
knowledge at this time, however evolved according to
the dynamical laws to a later time $t'$, so that

$$\mathcal{Q}(\phi') = \mathcal{G}(\phi(\phi') - m, D)\,\left|\frac{\partial\phi}{\partial\phi'}\right|. \tag{14}$$

The state $\mathcal{P}'$ we want to match to this using the
MEP is one that can be represented by a new set
of data $d' = d_{t'}$ at this later time via the IFT posterior
$\mathcal{P}'(\phi') = \mathcal{P}(\phi'|d') = \mathcal{G}(\phi' - m', D')$. Since
the data degrees of freedom are finite, the PDF implied
by this new data (via $m' = W' d'$ and
$D' = (\Phi'^{-1} + R'^\dagger N'^{-1} R')^{-1}$) will be of a parametric form,
with the new data being the parameters. However,
the evolved PDF will in general have a different functional
form. Therefore, a matching between the PDFs
$\mathcal{P}(\phi'|d')$ and $\mathcal{Q}(\phi')$ is needed, and using the MEP for
this ensures that the least amount of spurious information
is introduced in this unavoidable approximative
step.



3 Stochastic terms could easily be incorporated into the dynamics,
e.g. by setting $\mathcal{P}(\phi'|\phi) = \mathcal{G}(\phi' - \phi'(\phi), \delta t\,\Xi)$ in case
of additive Gaussian and temporally white noise $\xi_t$ with covariance
$\langle\xi_t\,\xi_{t'}^\dagger\rangle_{(\xi)} = \delta(t - t')\,\Xi$. This is a straightforward
extension of the scheme presented here.

C. Simulation schemes construction

The IFD methodology to discretize the dynamics of
a field can be summarized with the following recipe:

1. Field dynamics: The field dynamics equations
have to be specified. The KG equation, which 
can be derived from a suitable Hamiltonian, will 
serve as an example in this work. 

2. Prior knowledge: The ignorance knowledge 
state in case of the absence of data has to be 
specified. In our example the field will be as- 
sumed to be initially excited by contact with 
a thermal bath of known temperature. The 
Hamiltonian determining the field dynamics will 
therefore also determine the background knowl- 
edge on the initial state in our example. 

3. Data constraints: The relation of data and 
the ensemble of field configurations being con- 
sistent with data and background knowledge 
has to be established using IFT. Assimilation of 
external measurement data into the simulation 
scheme is naturally done during this step. 

4. Field evolution: The evolution of the field en- 
semble over a short time interval has to be de- 
scribed. This either involves the evolution of the 
mean and spread of the ensemble, or — as we 
will use here — the analytical description of the 
evolution of all possible field configurations. 

5. Prior update: The background knowledge for 
the later time has to be constructed. In the 
chosen example, energy and phase-space conser- 
vation of the Hamiltonian dynamics guarantee 
that the same thermal ignorance state also holds 
at later times. 

6. Data update: The relation of data and field 
ensemble has to be invoked again to construct 
the data of the later time using entropic match- 
ing based on the MEP. Thereby a transforma- 
tion rule is constructed that describes how the 
initial data determines the later data. This 
transformation forms the desired numerical sim- 
ulation scheme. It has incorporated the physics 
of the sub-grid degrees of freedom into opera- 
tions solely in data space. 

An IFD simulation scheme resulting from this recipe 
acts only on the data space. Any sub-grid dynamics is 
encapsulated implicitly. This is ensured by the auxil- 
iary analytic considerations that construct the ensem- 
ble of possible field configurations, evolve them ana- 
lytically in time, and map them back onto the data 
representation using entropic matching. 
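To make the data flow of the recipe tangible, the following sketch performs a strongly simplified, mean-only round trip: the data are turned into a mean field with the Wiener filter (step 3), the mean field is advanced by one explicit Euler step (step 4), and the response is applied again (a stand-in for step 6). It ignores the field covariance and the entropic matching altogether, so it should not be read as the scheme derived in this work. The function name `mean_only_step`, the periodic diffusion operator, the prior covariance, and all sizes are assumptions chosen for illustration.

```python
import numpy as np

def mean_only_step(d, R, Phi, N, L, c, dt):
    """One simplified data update: reconstruct the mean field, evolve it, re-measure it."""
    # Step 3 (data constraints): Wiener filter in its data space form
    W = Phi @ R.T @ np.linalg.inv(R @ Phi @ R.T + N)
    m = W @ d
    # Step 4 (field evolution): explicit Euler step of d_t phi = c + L phi
    m_new = m + dt * (c + L @ m)
    # Step 6 (data update), simplified: apply the response to the evolved mean
    return R @ m_new

# Hypothetical toy setup: periodic diffusion on 32 fine pixels, 8 data cells
n_pix, n_cells = 32, 8
x = np.arange(n_pix)
gap = np.abs(x[:, None] - x[None, :])
sep = np.minimum(gap, n_pix - gap)              # periodic pixel distance
Phi = np.exp(-sep / 4.0)                        # assumed prior covariance
R = np.kron(np.eye(n_cells), np.full((1, n_pix // n_cells), n_cells / n_pix))
N = np.zeros((n_cells, n_cells))                # noiseless data
L = (np.roll(np.eye(n_pix), 1, axis=0)
     + np.roll(np.eye(n_pix), -1, axis=0)
     - 2 * np.eye(n_pix))                       # periodic second-difference operator
c = np.zeros(n_pix)

d = np.sin(2 * np.pi * (np.arange(n_cells) + 0.5) / n_cells)
d_new = mean_only_step(d, R, Phi, N, L, c, dt=0.1)
```

Even in this reduced form the update is a linear map acting on the data alone, with the sub-grid assumptions entering only through $\Phi$ and $R$. For the diffusion operator chosen here the sum of the data is conserved while its amplitude decays.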



III. INFORMATION FIELD DYNAMICS 

The IFD program outlined above shall now be discussed
in detail by following the recipe of Sect. II C
step by step. The discussion will only deal
with linear dynamics and Gaussian knowledge states.
Many interesting problems involve nonlinear dynamics,
and consequently should lead to non-Gaussian
knowledge states. However, the construction of a nonlinear
IFD theory will have its foundation in linear
theory, which therefore needs to be developed first.



► In order to illustrate the IFD methodology, the 
problem of how to discretize the dynamics of a ther- 
mally excited Klein-Gordon field in one-dimensional 
position space is chosen as an example. Since exact 
solutions of the field dynamics can easily be given in 
Fourier-space representation, an exact, sub-grid field 
model exists in this case to which numerical solutions 
using IFD and other discretization schemes can be 
compared. Passages dealing specifically with this ex- 
ample are marked as this paragraph and might be 
skimmed over on a first reading. 



A. Field dynamics 

The linear dynamics of a field $\phi$ can in general be
written as

$$\partial_t\phi = c + L\,\phi, \tag{15}$$



where L is a linear operator acting on the field vec- 
tor of a time instance, thereby determining the field's 
time derivative. L can be a differential operator, it 
can include integro-differential operations, and it can 
depend on time. A dependence on earlier field values 
is excluded from L, which is therefore assumed here 
to be local in time. The field independent, but poten- 
tially time and position dependent additive term c is 
a source term of the field. 

Nonlinear dynamics of the form

$$\partial_t\chi = F(\chi) \tag{16}$$

can often be cast approximatively into the form (15)
via a Fréchet-Taylor expansion around a sufficiently
good and known approximation $\psi$ for $\chi = \psi + \phi$:

$$\partial_t\phi = F(\psi) - \partial_t\psi + \partial_\psi F(\psi)\,\phi + \mathcal{O}(\phi^2). \tag{17}$$

One obvious choice of such an approximation would
be to use a static function $\psi_t = \chi_{t_0}$ for some short
period $[t_0, t_1]$ and afterwards $\psi_t = \chi_{t_1}$ for the next
such period, always ensuring $\phi$ to be small and second
order effects to be negligible.
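The linearization (17) can be tried out on a finite-dimensional stand-in for the field. In the sketch below the right-hand side `F`, the static reference state `psi`, and the finite-difference Jacobian are hypothetical choices made for illustration; the check only confirms that the remainder shrinks quadratically with the deviation $\phi$.

```python
import numpy as np

def F(chi):
    """Hypothetical nonlinear right-hand side, d chi / dt = F(chi)."""
    return -chi**3 + np.roll(chi, 1) - chi

def jacobian(F, psi, eps=1e-6):
    """Finite-difference approximation of dF/dpsi at psi (central differences)."""
    n = len(psi)
    J = np.empty((n, n))
    for k in range(n):
        e = np.zeros(n)
        e[k] = eps
        J[:, k] = (F(psi + e) - F(psi - e)) / (2 * eps)
    return J

psi = np.linspace(0.5, 1.5, 5)     # static reference state, so d psi / dt = 0
c = F(psi)                          # source term c = F(psi) - d psi / dt
L = jacobian(F, psi)                # linear operator L = dF/dpsi

def linearization_error(scale):
    """Largest deviation between F(psi + phi) and c + L phi for a uniform phi."""
    phi = scale * np.ones_like(psi)
    return np.max(np.abs(F(psi + phi) - (c + L @ phi)))

err_small = linearization_error(1e-3)
err_large = linearization_error(1e-2)
```

Increasing the deviation by a factor of ten raises the error by roughly a factor of one hundred, as expected for an $\mathcal{O}(\phi^2)$ remainder.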

Stochastic terms in the evolution equations can also
be included into the formalism; however, here we refrain
from such complications and assume fully deterministic
dynamics. If higher time derivatives are part
of the linear or linearized evolution equation, these
can be included as further components of $\phi$.

► For example, the one-dimensional Klein-Gordon
(KG) equation for a real scalar field $\varphi$ with mass $\mu$,

$$\partial_t^2\varphi = (\partial_x^2 - \mu^2)\,\varphi, \tag{18}$$

which will serve as a concrete example in this work,
can be cast into the form (15) by setting $\phi = (\varphi, \pi)^\dagger$
and

$$\partial_t \begin{pmatrix} \varphi \\ \pi \end{pmatrix} = \begin{pmatrix} 0 & 1 \\ \partial_x^2 - \mu^2 & 0 \end{pmatrix} \begin{pmatrix} \varphi \\ \pi \end{pmatrix}. \tag{19}$$
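Here, $\pi = \partial_t\varphi$ is the canonical momentum field of the
KG field $\varphi$, which can be discriminated by context
from the number $\pi$. The dagger denotes transposing
and complex conjugation of functional vectors so that
$\varphi^\dagger j = \int dx\,\overline{\varphi_x}\,j_x = \int dk\,\overline{\varphi_k}\,j_k/(2\pi)$ in real and Fourier
space, respectively. The scalar product of two-component
fields $\phi = (\phi^{(\varphi)\dagger}, \phi^{(\pi)\dagger})^\dagger$ and $\psi = (\psi^{(\varphi)\dagger}, \psi^{(\pi)\dagger})^\dagger$
is

$$\phi^\dagger\psi = \int dx\,\left(\overline{\phi^{(\varphi)}_x}\,\psi^{(\varphi)}_x + \overline{\phi^{(\pi)}_x}\,\psi^{(\pi)}_x\right) = \int\frac{dk}{2\pi}\,\left(\overline{\phi^{(\varphi)}_k}\,\psi^{(\varphi)}_k + \overline{\phi^{(\pi)}_k}\,\psi^{(\pi)}_k\right) \tag{20}$$

in real and Fourier space, respectively.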

The KG field dynamics can be derived from the
quadratic Hamiltonian of the dynamical system

$$\mathcal{H}(\phi) = \frac{1}{2}\,\phi^\dagger E\,\phi = \frac{1}{2}\int dx\,\left(\pi_x^2 + (\partial_x\varphi_x)^2 + \mu^2\varphi_x^2\right) = \int\frac{dk}{4\pi}\,\left(|\pi_k|^2 + (\mu^2 + k^2)\,|\varphi_k|^2\right) \tag{21}$$

in abstract, position space and Fourier space notation,
respectively. Here and in the following, $x$ and $y$
are coordinates in position space, $k$ and $q$ coordinates
in continuous or discrete Fourier space, $t$ is a time
coordinate, and coordinate labels determine in which
functional basis a component of a field is to be read
out. The kernel $E$ of the Hamiltonian reads, in the
Fourier basis,

$$E_{kq} = 2\pi\,\delta(k - q) \begin{pmatrix} \mu^2 + k^2 & 0 \\ 0 & 1 \end{pmatrix}. \tag{22}$$

This determines the KG dynamics via

$$\partial_t\phi = S\,\partial_\phi\mathcal{H}(\phi) = S\,E\,\phi, \tag{23}$$

with the symplectic matrix

$$S = \begin{pmatrix} 0 & 1 \\ -1 & 0 \end{pmatrix}. \tag{24}$$

Therefore, the linear time evolution operator is $L = S\,E$
and the temporal source is $c = 0$ in our example.
The Fourier space representation of the KG dynamics,
$(\partial_t^2 + k^2 + \mu^2)\,\varphi_k = 0$, has the solution

$$\varphi_k = a_k\,e^{i\omega t} + \overline{a_{-k}}\,e^{-i\omega t}, \qquad \pi_k = i\omega\left(a_k\,e^{i\omega t} - \overline{a_{-k}}\,e^{-i\omega t}\right), \tag{25}$$

with $\omega = \sqrt{k^2 + \mu^2}$, $i = \sqrt{-1}$, and $a_k \in \mathbb{C}$. With
respect to the remaining degrees of freedom, the complex
amplitudes $a_k$, the Hamiltonian becomes

$$\mathcal{H}(a) = 2\int\frac{dk}{2\pi}\,|a_k|^2\,(k^2 + \mu^2), \tag{26}$$

which implies that these variables are stationary,
$\partial_t a_k = 0$. Therefore, an exact high resolution solution
can be specified for the KG example for all times. This
will be compared to approximative low resolution solutions
provided by simulation schemes derived from
IFD and by the usual discretization of differential operators
as described in Appendix A. ◄
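The exact mode solution can be written as a small propagator and checked against the two statements made above: the amplitudes $a_k$ are stationary and the mode energy is conserved. The sketch below treats a single Fourier mode; the test values for $k$, $t$, and the initial mode coefficients are arbitrary assumptions.

```python
import numpy as np

mu = 1.0                                   # field mass (value used for the example of this work)

def evolve_mode(phi_k, pi_k, k, t):
    """Exact solution of d_t (phi, pi) = (pi, -(k^2 + mu^2) phi) for one Fourier mode."""
    w = np.sqrt(k**2 + mu**2)
    c, s = np.cos(w * t), np.sin(w * t)
    return c * phi_k + s / w * pi_k, -w * s * phi_k + c * pi_k

def amplitude(phi_k, pi_k, k, t):
    """Complex amplitude a_k obtained by inverting Eq. (25)."""
    w = np.sqrt(k**2 + mu**2)
    return 0.5 * np.exp(-1j * w * t) * (phi_k - 1j * pi_k / w)

def mode_energy(phi_k, pi_k, k):
    """Integrand of the Fourier space Hamiltonian, |pi_k|^2 + (mu^2 + k^2) |phi_k|^2."""
    return abs(pi_k)**2 + (mu**2 + k**2) * abs(phi_k)**2

k, t = 2.5, 0.7                            # arbitrary test values
phi0, pi0 = 0.3 + 0.4j, -1.1 + 0.2j        # arbitrary initial mode coefficients
phi1, pi1 = evolve_mode(phi0, pi0, k, t)

a_before = amplitude(phi0, pi0, k, 0.0)
a_after = amplitude(phi1, pi1, k, t)
```

Because each mode evolves independently, a high resolution reference solution can be obtained by applying `evolve_mode` to all Fourier coefficients, without any time stepping error.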



B. Prior knowledge 

The signal field prior $\mathcal{P}(\phi)$ has to be specified. The
prior should summarize the data-independent knowledge
on the field configuration at current time $t$. For
practical reasons, one will typically approximate it by
a Gaussian

$$\mathcal{P}(\phi) = \mathcal{G}(\phi - \psi, \Phi) \tag{27}$$

with properly chosen mean field $\psi = \langle\phi\rangle_{(\phi)}$ and prior
uncertainty variance $\Phi = \langle(\phi - \psi)(\phi - \psi)^\dagger\rangle_{(\phi)}$. Such
an approximation is often possible, since even non-Gaussian
knowledge states are typically sufficiently
well approximated by Gaussians. Any sophisticated
treatment of the otherwise resulting non-linear, interacting
IFT is beyond the scope of this paper.

The Gaussian prior can also be justified from a pure
information theoretical point of view. In case only the
prior mean $\psi$ and variance $\Phi$ are known from physical
considerations, the MEP distribution of the field $\phi$
representing exactly this knowledge is given by the
Gaussian (27) with this mean and variance, as shown
in Appendix B.

Any known mean field $\psi$ can easily be absorbed by
the redefinitions $\phi \rightarrow \phi' = \phi - \psi$ and $c \rightarrow c' = c + L\,\psi$.
This, however, might create a c-term even if none ex- 
isted initially in the dynamical equation. Therefore 
we keep the possibility of a prior mean in the formal- 
ism, but note that there is some freedom to trade a 
prior mean ip against a field independent c-term and 
vice versa. 

► For our illustrative example of a KG field, we assume
that the field was initially in contact and equilibrium
with a thermal reservoir at temperature $\beta^{-1}$
and became decoupled from it at some time $t_0 = 0$.
The initial probability function of the field is therefore
thermal,

$$\mathcal{P}(\phi|\beta) = \frac{1}{Z_\beta}\,e^{-\beta\,\mathcal{H}(\phi)} = \frac{1}{Z_\beta}\,\exp\left(-2\beta\int\frac{dk}{2\pi}\,|a_k|^2\,(k^2 + \mu^2)\right). \tag{28}$$

It separates into independently excited modes, which
do not exchange energy at later times because the amplitudes
are stationary. Thus, an initially established
thermal state stays thermal and at the same temperature
for all times. The partition function is given by
a complex Gaussian integral for each mode and is

$$Z_\beta = \int\mathcal{D}\phi\,e^{-\beta\,\mathcal{H}(\phi)} = \prod_k \frac{\pi}{2\beta\,(k^2 + \mu^2)}, \tag{29}$$

where the product goes over all accessible positive
wave vectors.

Since the energy Hamiltonian H(<fr) = \$E§ is 
quadratic in <j>, the prior information Hamiltonian 
H{<t>\P) = PU{(p) = ^(p ] E(P is quadratic as well. The 
prior is simply a Gaussian V{<f)\P) = G(<f>, 3>) with zero 
mean ip — and covariance $ = (ft E)^ 1 . In Fourier 
space this reads 



27T 

*fc 9 = -j S(k - q) 



(30) 
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Figure 1. (Color online) A realization of a thermally exited KG field ip x (a) and its momentum distribution n x (b) is 
shown for /3 — 1 and /j, = 1 at t = with a resolution of 2048 pixels with black lines passing through the diamond 
symbols. The low resolution data with J\f = 64 data points describing the same fields are shown with yellow diamonds. 
The field configuration at t = 0.1 is also shown in panel (a) with a thin brown (grey) line. The KG field <p x shows a 
correlated structure due to the suppression of small scale power by the gradient term in the Hamiltonian, whereas its 
momentum field n x is just white noise. The loss of small-scale structure information in the low resolution sampling is 
especially apparent for the momentum data. 



and in position space it is

$$\Phi_{xy} = \frac{1}{\beta}\begin{pmatrix}\frac{1}{2\mu}\,e^{-\mu|x-y|} & 0\\ 0 & \delta(x-y)\end{pmatrix}. \quad (31)$$

A KG field realization drawn from (28) for $\beta = 1$ and $\mu = 1$ is displayed in Fig. 1. There, the different spatial correlation structures of the field values with $\langle\varphi_x\varphi_y\rangle_{(\phi)} = (2\mu\beta)^{-1}e^{-\mu|x-y|}$ and field momenta with $\langle\pi_x\pi_y\rangle_{(\phi)} = \beta^{-1}\delta(x-y)$, as given by (31), can be seen.
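The relation between the Fourier-space spectrum in (30) and the exponential correlation in (31) can be checked numerically. The following is only a minimal sketch: the function names, the cutoff `k_max` and the step `dk` are illustrative assumptions, and the Fourier convention $\int dk/(2\pi)$ is the one used in this section.

```python
import math

def phi_cov_position(x, mu=1.0, beta=1.0, k_max=400.0, dk=0.005):
    """Field-value covariance at separation x from the spectrum 1/(beta (k^2 + mu^2)).

    Evaluates (1/2pi) * integral dk cos(k x) / (beta (k^2 + mu^2)) with the
    trapezoidal rule on [-k_max, k_max] (the sine part vanishes by symmetry).
    """
    n = int(round(2 * k_max / dk))
    total = 0.0
    for i in range(n + 1):
        k = -k_max + i * dk
        weight = 0.5 if i in (0, n) else 1.0
        total += weight * math.cos(k * x) / (beta * (k * k + mu * mu))
    return total * dk / (2.0 * math.pi)

def phi_cov_closed_form(x, mu=1.0, beta=1.0):
    """Closed form exp(-mu |x|) / (2 mu beta) of the field-value block of (31)."""
    return math.exp(-mu * abs(x)) / (2.0 * mu * beta)

if __name__ == "__main__":
    for x in (0.5, 1.0, 2.0):
        print(x, phi_cov_position(x), phi_cov_closed_form(x))
```

The momentum block needs no such check, since a flat spectrum transforms into $\delta(x-y)$ directly.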



C. Data constraints 

In addition to the relatively vague prior knowledge, the field is constrained by the finite dimensional data vector $d = (d_i)_i$ in computer memory. The data is assumed to represent linear statements on the field of the form $d = R\phi + n$, cf. Eq. (6). In typical numerical simulation schemes, the response operator might just express an averaging of the field within some environment of a grid point $x_i \in \Omega_i$, i.e.

$$R_{ix} = \frac{1}{|\Omega_i|}\,\theta(x\in\Omega_i), \quad (32)$$

where the logical theta function

$$\theta(x\in\Omega_i) = \mathcal{P}(x\in\Omega_i\,|\,x,\Omega_i) = \begin{cases}1 & x\in\Omega_i\\ 0 & x\notin\Omega_i\end{cases} \quad (33)$$

is one, if the condition in its argument is true, otherwise it is zero. In schemes based on grid cells or space tessellations, the grid point volumes are disjoint, $\Omega_i\cap\Omega_j = \emptyset$ for $i \neq j$. In case a conserved quantity should be conserved as accurately as possible, the total amount of the quantity within the cells of a space tessellation as well as the currents of the quantity through the surfaces of the tessellation cells might be used as data. In smoothed particle hydrodynamics, the volumes overlap and are usually also structured by radially declining kernel functions that have evolving locations and sizes.

For the moment, we only have to deal with the data at one instant, and need only to know that it depends linearly on the underlying field by a known relation of the form $d = R\phi + n$. This relation might or might not be the same at the next instant, depending on the design choices for $R = R_t$ (stationary grid or Lagrangian moving mesh). $R_t$ could even be determined by the IFD formalism itself by requiring minimal information loss of the scheme, as we will do later for the KG field example in Sect. III F.

The simulation data vector $d$ can even be extended to also contain measurement data on the system to be simulated (e.g. the weather) obtained for the current simulation time. If this auxiliary data $o$ resulted from a linear measurement $o = \mathfrak{R}\phi + \mathfrak{n}$ with response $\mathfrak{R}$ and Gaussian noise $\mathfrak{n}$ with covariance $\mathfrak{N}$, only the replacements

$$R \rightarrow \begin{pmatrix}R\\ \mathfrak{R}\end{pmatrix} \quad\text{and}\quad N \rightarrow \begin{pmatrix}N & 0\\ 0 & \mathfrak{N}\end{pmatrix} \quad (34)$$

are needed (see footnote 4). This way, the measurement information is assimilated into the simulation scheme and can be evolved into the future (or into the past, if the simulation is backward in time). The added data could become simulation degrees of freedom, or they could be discarded at the next simulation time step after their information was transferred to the simulation data via the entropic matching operation. The former option would certainly conserve more information, the latter is somewhat similar to what is done in particle filter methods as described in Appendix A 3.

4 The block diagonal structure of the extended noise covariance matrix assumes that the measurement error and the simulation error are uncorrelated. This assumption would be improper in case repeated measurements with the same incorrectly calibrated instrument are assimilated into the simulation. In that case, correlations among the simulation and measurement data errors could exist since the correlated measurement errors are partly imprinted onto the simulation data.

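The ensemble of field configurations constrained by the data via (6) and by the prior via (27) is then

$$\mathcal{P}(\phi|d) = \mathcal{G}(\phi - m, D), \quad (35)$$

where

$$D = (\Phi^{-1} + R^\dagger N^{-1} R)^{-1}$$

and

$$m = \psi + W\,(d - R\psi) = D\,(R^\dagger N^{-1} d + \Phi^{-1}\psi). \quad (36)$$

The mean is shifted here with respect to the zero-prior-mean result $m = W d$ due to the non-vanishing prior mean $\psi$.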

In case that external data is to be assimilated into the simulation, applying the replacements of (34) to (36) and expanding this yields $D = (\Phi^{-1} + R^\dagger N^{-1} R + \mathfrak{R}^\dagger\mathfrak{N}^{-1}\mathfrak{R})^{-1}$ and $m = D\,(R^\dagger N^{-1} d + \mathfrak{R}^\dagger\mathfrak{N}^{-1} o + \Phi^{-1}\psi)$. Thus, data assimilation is very naturally done in IFD since simulation and measurement data shape the field posterior $\mathcal{P}(\phi|d) = \mathcal{G}(\phi - m, D)$ in a similar way.
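A scalar toy version of (36) with the replacements (34) may make the merging of simulation and measurement data concrete. This is a minimal sketch, assuming a single real field degree of freedom and uncorrelated noise; the function name and all numerical values are illustrative and not taken from the paper.

```python
def posterior(prior_mean, prior_var, responses, noise_vars, data):
    """Scalar Wiener filter for several independent linear data points.

    Implements D = (1/Phi + sum_i R_i^2 / N_i)^(-1) and
    m = D (sum_i R_i d_i / N_i + psi / Phi) for one real field value.
    """
    inv_d = 1.0 / prior_var + sum(r * r / n for r, n in zip(responses, noise_vars))
    d_var = 1.0 / inv_d
    source = prior_mean / prior_var + sum(
        r * d / n for r, n, d in zip(responses, noise_vars, data)
    )
    return d_var * source, d_var

if __name__ == "__main__":
    # Simulation datum (R = 1.0, N = 0.1) and measurement datum (R = 0.5, N = 0.4).
    m_joint, d_joint = posterior(0.3, 2.0, [1.0, 0.5], [0.1, 0.4], [1.2, 0.7])
    # Same information assimilated one datum after the other.
    m_1, d_1 = posterior(0.3, 2.0, [1.0], [0.1], [1.2])
    m_seq, d_seq = posterior(m_1, d_1, [0.5], [0.4], [0.7])
    print(m_joint, d_joint)
    print(m_seq, d_seq)
```

With block diagonal noise, stacking both data points at once and assimilating them sequentially give the same posterior, which is what the replacement rule expresses.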

► In our example of the KG field we want to deal with the simplest possible data as given by (6) and (32) that lives on a regular grid, with equidistant, space filling and disjoint pixel volumes $\Omega_i = [i\Delta, (i+1)\Delta)$, with $\Delta > 0$ being the grid spacing. Since on a computer one can only deal with finite domains, we assume periodic boundary conditions for the interval $\Omega = \cup_i\,\Omega_i = [0, 2\pi]$ and require that the number of grid points $\mathcal{N} = 2\pi/\Delta \in \mathbb{N}$. The Fourier transformed field is then

$$\phi_k = \int_0^{2\pi}\! dx\;\phi_x\,e^{ikx}, \quad (37)$$

with

$$\phi_x = \sum_{k=-\infty}^{\infty}\frac{1}{2\pi}\,\phi_k\,e^{-ikx}. \quad (38)$$

Here the following substitutions with respect to the infinitely extended case have been made: $\int dx \rightarrow \int_0^{2\pi} dx$ and $\int\frac{dk}{2\pi} \rightarrow \sum_{k=-\infty}^{\infty}\frac{1}{2\pi}$, which are the appropriately weighted sums of the scalar products in position and Fourier space, respectively. Furthermore, we note that $\delta(k-q) \rightarrow \delta_{kq}$ in this case, so that the unit operator is $\mathbb{1}_{kq} = 2\pi\,\delta_{kq}$ and the field covariance (30) reads

$$\Phi_{kq} = \frac{2\pi}{\beta}\,\delta_{kq}\begin{pmatrix}(k^2+\mu^2)^{-1} & 0\\ 0 & 1\end{pmatrix}. \quad (39)$$



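Since the data space is finite, its Fourier space is also finite, where

$$d_k = \sum_{i=0}^{\mathcal{N}-1}\Delta\,e^{ik\,i\Delta}\,d_i, \quad\text{with} \quad (40)$$

$$d_i = \sum_{k=0}^{\mathcal{N}-1}\frac{1}{2\pi}\,e^{-ik\,i\Delta}\,d_k, \quad (41)$$

and $k \in \{0, \ldots, \mathcal{N}-1\}$. Higher or negative Fourier modes do not carry any additional information due to the Nyquist theorem (see footnote 5).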
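The discrete Fourier convention used here, with weight $\Delta$ in the forward sum and $1/(2\pi)$ in the backward sum, differs from that of common FFT libraries. A minimal sketch of the transform pair, assuming the sign convention $e^{+ik\,i\Delta}$ for the forward direction, shows that the round trip is the identity because $\mathcal{N}\Delta = 2\pi$. The function names are illustrative.

```python
import cmath
import math

def data_to_fourier(d):
    """Forward transform d_k = sum_i Delta * exp(+i k i Delta) * d_i."""
    n = len(d)
    delta = 2.0 * math.pi / n
    return [
        sum(delta * cmath.exp(1j * k * i * delta) * d[i] for i in range(n))
        for k in range(n)
    ]

def fourier_to_data(d_k):
    """Backward transform d_i = sum_k exp(-i k i Delta) * d_k / (2 pi)."""
    n = len(d_k)
    delta = 2.0 * math.pi / n
    return [
        sum(cmath.exp(-1j * k * i * delta) * d_k[k] for k in range(n)) / (2.0 * math.pi)
        for i in range(n)
    ]

if __name__ == "__main__":
    data = [0.3, -1.2, 0.7, 2.0, 0.0, -0.4, 1.1, 0.9]
    print(fourier_to_data(data_to_fourier(data)))
```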

The Fourier transformed response,

$$R_{kq} = 2\pi\,\theta(q-k\in\mathcal{N}\mathbb{Z})\,\frac{1-e^{-iq\Delta}}{iq\Delta} \quad (42)$$

$$\phantom{R_{kq}} = 2\pi\,\theta(q-k\in\mathcal{N}\mathbb{Z})\,e^{-\frac{1}{2}iq\Delta}\,\mathrm{sinc}\!\left(\tfrac{1}{2}q\Delta\right), \quad (43)$$

is block diagonal in the reduced Fourier space of the data with $k \in \{0, \ldots, \mathcal{N}-1\}$. Note, however, that higher Fourier modes of the field $\phi_q$ with $q \in k + \mathcal{N}\mathbb{Z}$, which carry information on sub-grid structure, imprint also onto the data and blend with the lower Fourier modes $k \in \{0, \ldots, \mathcal{N}-1\}$. Therefore a unique reconstruction of the individual Fourier modes from the data alone is impossible even within the range $q \in \{0, \ldots, \mathcal{N}-1\}$.

The individual terms in (42) can easily be understood. The $\exp(-\frac{1}{2}iq\Delta)$ term stems from the fact that the centers of the pixel volumes are shifted by $\frac{1}{2}\Delta$ from the pixel positions $i\Delta$ used in the definition of the Fourier transformation. The sinc-function is the
Fourier space transform of the pixel window. It en- 
codes how well a given Fourier mode is represented in 
the data, and therefore how well it is protected from 
noise and confusion with other modes imprinted onto 
the same data mode. 
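The phase factor, the sinc window and the aliasing of sub-grid modes can be reproduced by bin-averaging a single plane wave. The sketch below assumes the conventions of this section (forward transform with $e^{+ikx}$, bins $[i\Delta, (i+1)\Delta)$, $\mathrm{sinc}(x) = \sin(x)/x$); the function names are illustrative. For a field with only $\phi_q = 1$ excited, the data mode $d_k$ should equal $R_{kq}/(2\pi)$.

```python
import cmath
import math

def sinc(x):
    """Unnormalized sinc, sin(x)/x, with the limit 1 at x = 0."""
    return 1.0 if x == 0 else math.sin(x) / x

def binned_mode_data(q, n):
    """Exact bin averages of the single field mode phi_x = exp(-i q x) / (2 pi)."""
    delta = 2.0 * math.pi / n
    out = []
    for i in range(n):
        a, b = i * delta, (i + 1) * delta
        if q == 0:
            avg = 1.0
        else:
            # (1/Delta) * integral_a^b exp(-i q x) dx
            avg = (cmath.exp(-1j * q * b) - cmath.exp(-1j * q * a)) / (-1j * q * delta)
        out.append(avg / (2.0 * math.pi))
    return out

def data_mode(d, k):
    """Discrete Fourier mode d_k = sum_i Delta * exp(+i k i Delta) * d_i."""
    n = len(d)
    delta = 2.0 * math.pi / n
    return sum(delta * cmath.exp(1j * k * i * delta) * d[i] for i in range(n))

def response_over_2pi(k, q, n):
    """Closed form R_kq / (2 pi): zero unless q - k is a multiple of n."""
    delta = 2.0 * math.pi / n
    if (q - k) % n != 0:
        return 0.0
    return cmath.exp(-0.5j * q * delta) * sinc(0.5 * q * delta)

if __name__ == "__main__":
    n = 8
    for q in (3, 11, -5, 4):
        print(q, data_mode(binned_mode_data(q, n), 3), response_over_2pi(3, q, n))
```

The modes $q = 3$, $11$ and $-5$ all feed the data mode $k = 3$ on an 8-pixel grid, with weights that fall off with the sinc window, whereas $q = 4$ leaves it untouched.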

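The data space signal covariance, which is needed by the Wiener filter, is

$$\tilde\Phi_{kq} = (R\,\Phi\,R^\dagger)_{kq} = \frac{2\pi}{\beta}\,\delta_{kq}\begin{pmatrix}\tilde\Phi^{(\varphi)}_k & 0\\ 0 & \tilde\Phi^{(\pi)}_k\end{pmatrix}$$

with

$$\tilde\Phi^{(\varphi)}_k = \sum_{j\in\mathbb{Z}}\frac{\mathrm{sinc}^2\!\left(\tfrac{1}{2}(k+j\mathcal{N})\Delta\right)}{(k+j\mathcal{N})^2+\mu^2} = \frac{1}{\mu^2}\left[1 - \frac{2\,\sinh(\mu\Delta)\,\sin^2(\tfrac{1}{2}k\Delta)}{\mu\Delta\,\left(\cosh(\mu\Delta)-\cos(k\Delta)\right)}\right], \qquad \tilde\Phi^{(\pi)}_k = \sum_{j\in\mathbb{Z}}\mathrm{sinc}^2\!\left(\tfrac{1}{2}(k+j\mathcal{N})\Delta\right) = 1, \quad (44)$$

where the sums over the aliased modes were evaluated with the identities of footnote 6.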
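The closed form of the field-value part of the data-space covariance can be compared against a brute-force sum over aliased modes. The sketch below omits the common prefactor $2\pi/\beta$ and assumes the aliasing sum $\sum_j \mathrm{sinc}^2(\frac{1}{2}(k+j\mathcal{N})\Delta)/((k+j\mathcal{N})^2+\mu^2)$ that follows from (39) and (43); the truncation `j_max` and the function names are illustrative.

```python
import math

def phi_tilde_sum(k, n, mu, j_max=2000):
    """Truncated aliasing sum for the field-value data covariance of mode k."""
    delta = 2.0 * math.pi / n
    total = 0.0
    for j in range(-j_max, j_max + 1):
        q = k + j * n
        x = 0.5 * q * delta
        window = 1.0 if q == 0 else (math.sin(x) / x) ** 2
        total += window / (q * q + mu * mu)
    return total

def phi_tilde_closed(k, n, mu):
    """Closed form (1/mu^2) [1 - 2 sinh(mu D) sin^2(k D/2) / (mu D (cosh(mu D) - cos(k D)))]."""
    delta = 2.0 * math.pi / n
    ratio = (
        2.0 * math.sinh(mu * delta) * math.sin(0.5 * k * delta) ** 2
        / (mu * delta * (math.cosh(mu * delta) - math.cos(k * delta)))
    )
    return (1.0 - ratio) / (mu * mu)

if __name__ == "__main__":
    for k in (0, 1, 5, 31):
        print(k, phi_tilde_sum(k, 64, 1.0), phi_tilde_closed(k, 64, 1.0))
```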



Since the field covariance and response are translationally invariant we have every reason to believe that the noise statistics, which are fed only by approximation errors depending on these latter two quantities, will also be translationally invariant in data space. Therefore its covariance will also be diagonal in discrete Fourier space:

5 These conventions for the discrete Fourier transformation might appear a bit unusual, but they have the advantage that they match best the continuous space Fourier convention used in physics. They permit us to use all derived Fourier space equations for the KG field without changing normalization constants and with the intuitive identifications $dx \rightarrow \Delta$, $x \rightarrow i\Delta$ and $k \rightarrow k$.

6 Here, we used the following identities:

$$\sum_{j\in\mathbb{Z}}\frac{1}{(a+j)^2} = \frac{\pi^2}{\sin^2(\pi a)}$$

and

$$\sum_{j\in\mathbb{Z}}\frac{1}{(a+j)^2\left((a+j)^2+b^2\right)} = \frac{\pi}{b^3}\left[\frac{b\,\pi}{\sin^2(\pi a)} - \frac{\sinh(2\pi b)}{\cosh(2\pi b)-\cos(2\pi a)}\right].$$



$$N_{kq} = 2\pi\,\delta_{kq}\begin{pmatrix}\eta^{(\varphi)}_k & \eta^{(c)}_k\\ \eta^{(c)}_k & \eta^{(\pi)}_k\end{pmatrix}, \quad (45)$$

where $\eta^{(\varphi)}$, $\eta^{(\pi)}$, and $\eta^{(c)}$ are the noise spectra of the field value data, the field momenta data, and the cross-spectra of those, respectively. However, in Sect. III F we will show that the ideal IFD scheme stays noiseless if it was initially noiseless. Therefore we can set $N \rightarrow 0$ for all times and use the $\eta$-parameters to ensure consistency of all formulae. They will be set to zero at the end of the calculation if this is a permitted limit.

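Taking the noiseless case for granted for the moment, the Wiener filter becomes

$$W_{kq} = (\Phi\,R^\dagger\,\tilde\Phi^{-1})_{kq} = 2\pi\,\theta(q = k\ \mathrm{mod}\ \mathcal{N})\;e^{\frac{1}{2}ik\Delta}\,\mathrm{sinc}\!\left(\tfrac{1}{2}k\Delta\right)\begin{pmatrix}\dfrac{\mu^2}{\mu^2+k^2}\left[1-\dfrac{2\,\sinh(\mu\Delta)\,\sin^2(\tfrac{1}{2}k\Delta)}{\mu\Delta\,\left(\cosh(\mu\Delta)-\cos(k\Delta)\right)}\right]^{-1} & 0\\ 0 & 1\end{pmatrix}. \quad (46)$$

For a reconstructed signal image generated by this Wiener filter, any image Fourier mode $k \in \mathbb{Z}$ gets excited by its first Brillouin zone data space mode $q = k\ \mathrm{mod}\ \mathcal{N} \in \{0, \ldots, \mathcal{N}-1\}$. Thereby, all Fourier modes $k \in \mathbb{Z}$ of the mean field $m = W d$ get some non-trivial value if the corresponding data mode $k\ \mathrm{mod}\ \mathcal{N}$ was non-zero. ◄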



D. Field evolution 

A Gaussian knowledge state $\mathcal{P}(\phi|t) = \mathcal{P}(\phi|\,d = d(t)) = \mathcal{G}(\phi - m, D)$ at some initial time $t$ is represented by the data $d = d_t$, which determines the mean field via $m = W d$. The field uncertainty variance $D$ is data- and time-independent in our example, but not in general. The knowledge state $\mathcal{P}(\phi|t)$ has to be evolved to an infinitesimally later time $t' = t + \delta t$ via the evolution of the individual field configurations.

An individual field configuration $\phi = \phi_t$ at initial time $t$ becomes $\phi' = \phi_{t'} \doteq \phi_t + \delta t\,\dot\phi_t = \phi_t + \delta t\,(L\,\phi_t + c)$,
where the time derivative is given by (f5). Here, 



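and in the following, we drop non-essential terms of $O(\delta t^2)$, as indicated by "$\doteq$". The time-evolved knowledge state therefore becomes

$$\mathcal{P}(\phi'|d) = \mathcal{P}(\phi|d)\,\left|\frac{\partial\phi}{\partial\phi'}\right| \quad (47)$$

by conservation of probability density. We need to calculate the Jacobian up to linear order in $\delta t$. This is most simply done from the inverse Jacobian,

$$\left|\frac{\partial\phi'}{\partial\phi}\right| = |1 + \delta t\,L| = \exp\log|1 + \delta t\,L| \doteq \exp\mathrm{Tr}\,(\delta t\,L) \doteq 1 + \delta t\,\mathrm{Tr}\,(L). \quad (48)$$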



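In case of a linear Hamiltonian dynamics $\partial_t\phi = S\,\partial_\phi\mathcal{H}(\phi)$, with dynamical Hamiltonian of the form $\mathcal{H}(\phi) = \frac{1}{2}\phi^\dagger E\,\phi + b^\dagger\phi$ and $E$ being block diagonal in the field value $\varphi$ and field momentum $\pi$ eigenspaces, we have $L = S\,E$ and $c = S\,b$. The Jacobian is then unity, since

$$\mathrm{Tr}\,(L) = \mathrm{Tr}\,(S\,E) = \mathrm{Tr}\left[\begin{pmatrix}0 & \mathbb{1}\\ -\mathbb{1} & 0\end{pmatrix}\begin{pmatrix}E_\varphi & 0\\ 0 & E_\pi\end{pmatrix}\right] = \mathrm{Tr}\begin{pmatrix}0 & E_\pi\\ -E_\varphi & 0\end{pmatrix} = 0. \quad (49)$$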



This is not surprising, since it is well known that sym- 
plectic Hamiltonian systems conserve the phase space 
density, so that the unity of the Jacobian is also valid 
for non-infinitesimal time steps St in such cases. 

In general, for non-Hamiltonian systems, the Jacobian can be different from one. It can be larger for systems with dynamical attractors or with dissipation (Navier-Stokes equations) and it can be smaller for systems with diverging phase-space flows, like chaotic inflation in cosmology or driven hydrodynamical turbulence (without significant dissipation).

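The evolved knowledge state, or the knowledge state on the evolved field, is therefore

$$\begin{aligned}\mathcal{P}(\phi'|d) &= \mathcal{P}(\phi = \phi' - \delta t\,\dot\phi\,|\,d)\,\left|\frac{\partial\phi}{\partial\phi'}\right| \quad (50)\\ &\doteq \mathcal{G}\big(\phi' - \delta t\,(L\,\phi' + c) - m,\ D\big)\,\big(1 - \delta t\,\mathrm{Tr}\,(L)\big)\\ &\doteq \mathcal{G}(\phi' - m^*, D^*),\end{aligned}$$

with (see footnote 7)

$$\begin{aligned}m^* &\doteq m + \delta t\,(c + L\,m) \quad (51)\\ &= (1 + \delta t\,L)\,\big(\psi + W\,(d - R\,\psi)\big) + \delta t\,c,\\ D^* &\doteq D + \delta t\,(L\,D + D\,L^\dagger),\\ D^{*-1} &\doteq D^{-1} - \delta t\,(D^{-1}L + L^\dagger D^{-1}).\end{aligned}$$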
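The first-order update of the uncertainty variance can be compared with the exact push-forward $(1+\delta t L)\,D\,(1+\delta t L)^\dagger$ for a single KG Fourier mode, where $L$ reduces to a real 2x2 matrix acting on $(\varphi_k, \pi_k)$. This is a sketch under that single-mode assumption; the helper names and the sample covariance are illustrative.

```python
def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def transpose(a):
    return [[a[j][i] for j in range(2)] for i in range(2)]

def add_scaled(a, b, s):
    """Return a + s * b for 2x2 matrices."""
    return [[a[i][j] + s * b[i][j] for j in range(2)] for i in range(2)]

def kg_generator(k, mu):
    """L = S E for one KG mode: d/dt (phi, pi) = (pi, -(k^2 + mu^2) phi)."""
    return [[0.0, 1.0], [-(k * k + mu * mu), 0.0]]

def first_order_covariance(d, l, dt):
    """D* = D + dt (L D + D L^T), dropping O(dt^2) terms."""
    rate = add_scaled(matmul(l, d), matmul(d, transpose(l)), 1.0)
    return add_scaled(d, rate, dt)

def exact_step_covariance(d, l, dt):
    """(1 + dt L) D (1 + dt L)^T, the push-forward under one explicit step."""
    step = add_scaled([[1.0, 0.0], [0.0, 1.0]], l, dt)
    return matmul(matmul(step, d), transpose(step))

if __name__ == "__main__":
    l = kg_generator(2.0, 1.0)
    d = [[0.2, 0.05], [0.05, 1.0]]
    print(first_order_covariance(d, l, 1e-3))
    print(exact_step_covariance(d, l, 1e-3))
```

The two results differ only by the term $\delta t^2\,L\,D\,L^\dagger$, and the trace of `kg_generator` vanishes, in line with the unit Jacobian of the symplectic case.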

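► In case of our KG field, we have $\mathrm{Tr}\,(L) = 0$ due to the symplectic dynamics with $L = S\,E$ and $c = 0$, as well as $m^* = m + \delta t\,S\,E\,m$. Furthermore, using $L = S\,E$, $S^\dagger = -S$, $D^{-1} = \Phi^{-1} + R^\dagger N^{-1}R$, and $\Phi^{-1} = \beta\,E$, we get $D^{*-1} = D^{-1} - \delta t\,(R^\dagger N^{-1}R\,S\,E - E\,S\,R^\dagger N^{-1}R)$.

7 The key to understanding this result is a short rearrangement in the exponent of the Gaussian,

$$\big((1-\delta t L)\phi' - m - \delta t\,c\big)^\dagger D^{-1}\big((1-\delta t L)\phi' - m - \delta t\,c\big) \doteq \big(\phi' - ((1-\delta t L)^{-1}m + \delta t\,c)\big)^\dagger (1-\delta t L)^\dagger D^{-1}(1-\delta t L)\,(\phi' - m^*),$$

the $\delta t$-expansion of the new mean field

$$m^* \doteq (1 + \delta t\,L)\,m + \delta t\,c,$$

that of the new uncertainty variance

$$D^* \doteq (1 + \delta t L)\,D\,(1 + \delta t L)^\dagger \doteq D + \delta t\,(L\,D + D\,L^\dagger),$$

and its determinant

$$|D^*| \doteq \big(1 + 2\,\delta t\,\mathrm{Tr}\,(L)\big)\,|D|.$$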
The evolved mean field can still be regarded as parametrized by the data, however, in a different way, $m^* = (1 + \delta t\,S\,E)\,W d$. It is not clear in general whether a new dataset $d'$ can be found that expresses this new mean field via the original parametrization $m' = W d'$ (or with the appropriate $W'$, in case that also $D'$ changed). This is because the functional forms of the two parametrizations differ since $W$ and $L = S\,E$ operate on completely different vector spaces, the discrete data space and the continuous field space, respectively.

Therefore entropic matching will be used to choose a $d'$ that determines $\mathcal{P}(\phi'|d')$ such that it captures most of the information content of $\mathcal{P}(\phi'|d)$. ◄



E. Prior update 

The field prior for time $t'$ has to be updated since the sub-grid statistics might have changed. For example some of the energy contained in sub-grid modes might dissipate, leading to a different $\mathcal{P}(\phi') = \mathcal{G}(\phi' - \psi', \Phi')$ as parametrized via the updated prior mean $\psi'$ and variance $\Phi'$.

► In case of our KG field, energy conservation of the dynamics leads to an unchanged prior for the evolved field $\mathcal{P}(\phi') = \mathcal{G}(\phi', \Phi)$, still with $\Phi = (\beta E)^{-1}$. ◄



F. Data update 

The new data has to be determined from its relation to the updated field. Again, we assume the new data to depend linearly on the evolved field

$$d' = R'\phi' + n'.$$

Note that we could choose a different pixelization at $t'$, leading to a different response $R'$, propagator $D'$, and Wiener filter $W'$. This is needed e.g. in case a simulation with moving or adaptive mesh is to be developed. It can even be considered that the response operator determination becomes a part of the entropic matching step, leading to an information optimal moving mesh.

Furthermore, we have to allow for a changed noise level, with new covariance $N'$, since the meaning of the data values could have changed with changed pixelization and since we might have to allow for additional uncertainty in order to capture any mismatch between the new parametrized posterior and the evolved field posterior.



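According to (35) and (36) the relation of new posterior and new data is

$$\mathcal{P}(\phi'|d') = \mathcal{G}(\phi' - m', D'), \quad (52)$$

where

$$\begin{aligned}D' &= (\Phi'^{-1} + R'^\dagger N'^{-1}R')^{-1},\\ m' &= \psi' + W'\,(d' - R'\psi') = D'\,(R'^\dagger N'^{-1}d' + \Phi'^{-1}\psi'),\ \text{and}\\ W' &= D'\,R'^\dagger N'^{-1} = \Phi'R'^\dagger\,(R'\,\Phi'\,R'^\dagger + N')^{-1}. \quad (53)\end{aligned}$$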


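Now, the new posterior $\mathcal{P}' = \mathcal{P}(\phi'|d')$ should match the evolved posterior $\mathcal{P} = \mathcal{P}(\phi'|d)$ as well as possible. According to (13) the cross entropy of the former with the latter is

$$S(\mathcal{P}'|\mathcal{P}) = -\frac{1}{2}\,\mathrm{Tr}\left[(\delta m\,\delta m^\dagger + D')\,D^{*-1} - 1 - \log(D'D^{*-1})\right] \quad (54)$$

with $\delta m = m' - m^*$.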

Maximizing this entropy with respect to the new data $d'$ yields

$$\begin{aligned}-\partial_{d'}S &= (\partial_{d'}m')^\dagger D^{*-1}\,\delta m\\ &= W'^\dagger D^{*-1}\,\big(W'\,(d' - R'\psi') + \psi' - m^*\big) = 0\\ \Rightarrow\ d' &= R'\psi' + (W'^\dagger D^{*-1}W')^{-1}\,W'^\dagger D^{*-1}\,(m^* - \psi'). \quad (55)\end{aligned}$$

This is the general formula to update the data. It should be expanded up to linear order in all the relevant changes in response $R' = R + \delta R$, noise covariance $N' = N + \delta N$, and prior parameters $\Phi' = \Phi + \delta\Phi$ and $\psi' = \psi + \delta\psi$, as well as in time $t' = t + \delta t$. The resulting general formula is lengthy and not directly instructive (see footnote 8), therefore we concentrate here more on special cases.

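The update of the uncertainty variance is also obtained by maximizing the entropy with respect to the degrees of freedom of $D' = (\Phi'^{-1} + R'^\dagger N'^{-1}R')^{-1}$. These could be the location of the new pixel positions, which influence $R'$, or an updated noise level, influencing $N'$, or properties of the field prior expressed via $\Phi'$ and $\psi'$.

We combine these degrees of freedom into the single vector $\eta$, irrespective of whether they determine $R'$, $N'$, $\Phi'$, $\psi'$, or combinations thereof. The entropic matching of the updated uncertainty variance $D' = D(\eta + \delta\eta) = D(\eta) + \sum_i \delta\eta_i\,\Gamma_i + O(\delta\eta^2)$, with $\Gamma_i = \partial_{\eta_i}D(\eta)$ the linear changes due to changes in the degrees of freedom, is then given by

$$\begin{aligned}-\partial_\eta S &= \frac{1}{2}\,\mathrm{Tr}\left[(\partial_\eta D')\,(D^{*-1} - D'^{-1})\right] = 0\\ \Rightarrow\ \delta\eta &= -C^{-1}b,\ \text{with}\\ b_i &= \mathrm{Tr}\left[\Gamma_i\,(D^{*-1} - D^{-1})\right]\ \text{and}\\ C_{ij} &= \mathrm{Tr}\left[\Gamma_i\,D^{-1}\,\Gamma_j\,D^{-1}\right]. \quad (56)\end{aligned}$$

The sign of $\delta\eta$ follows from inserting $D'^{-1} \doteq D^{-1} - \sum_j \delta\eta_j\,D^{-1}\Gamma_j D^{-1}$ into the first line.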



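8 A few useful identities, when dealing with (55), might be in order. A short calculation shows that up to linear order in $\delta t$, and with $\tilde\Phi' = R'\,\Phi'\,R'^\dagger$,

$$\begin{aligned}(W'^\dagger D^{*-1}W')^{-1}\,W'^\dagger &= (\tilde\Phi' + N')\,(R'\Phi'D^{*-1}\Phi'R'^\dagger)^{-1}R'\Phi'\\ &\doteq (\tilde\Phi' + N')\,\big(R'\Phi'\,(D^{-1} - \delta t\,(D^{-1}L + L^\dagger D^{-1}))\,\Phi'R'^\dagger\big)^{-1}R'\Phi'\\ &\doteq (\tilde\Phi' + N')\,\big(\tilde D + \delta t\,\tilde D\,R'\Phi'\,(D^{-1}L + L^\dagger D^{-1})\,\Phi'R'^\dagger\,\tilde D\big)\,R'\Phi',\end{aligned}$$

with $\tilde D = (R'\Phi'D^{-1}\Phi'R'^\dagger)^{-1}$, and that

$$\begin{aligned}D^{*-1}(m^* - \psi') &\doteq \big(D^{-1} - \delta t\,(D^{-1}L + L^\dagger D^{-1})\big)\,\big((1 + \delta t L)\,(\psi + W\,(d - R\psi)) + \delta t\,c - \psi'\big)\\ &\doteq D^{-1}(\psi - \psi') + R^\dagger N^{-1}(d - R\psi)\\ &\quad + \delta t\,\big[D^{-1}c - L^\dagger\,(D^{-1}\psi + R^\dagger N^{-1}(d - R\psi)) + (D^{-1}L + L^\dagger D^{-1})\,\psi'\big].\end{aligned}$$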
From the first line of (56) it is already apparent that if $D'$ is able to match $D^*$ exactly, then it will do so. The detailed formulae for updating response, noise, and prior can be complex, since operator inversions are involved. In general, approximations might be necessary here in order to proceed with a reasonable computational complexity.
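For a single real degree of freedom the entropy (54) can be evaluated directly, which makes the matching condition easy to inspect. The sketch below assumes scalar $m'$, $D'$, $m^*$ and $D^*$; the function name and the numbers are illustrative. The entropy is zero for a perfect match and negative otherwise, so its maximization drives $m' \rightarrow m^*$ and $D' \rightarrow D^*$ whenever the parametrization allows it.

```python
import math

def matching_entropy(m_new, d_new, m_star, d_star):
    """Scalar version of (54): S = -1/2 [ (dm^2 + D') / D* - 1 - log(D' / D*) ]."""
    dm = m_new - m_star
    return -0.5 * ((dm * dm + d_new) / d_star - 1.0 - math.log(d_new / d_star))

if __name__ == "__main__":
    m_star, d_star = 0.8, 0.5
    print(matching_entropy(m_star, d_star, m_star, d_star))   # perfect match
    print(matching_entropy(0.9, d_star, m_star, d_star))      # shifted mean
    print(matching_entropy(m_star, 0.7, m_star, d_star))      # too broad
    print(matching_entropy(m_star, 0.3, m_star, d_star))      # too narrow
```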

The formulae (55) and (56) form the desired simulation scheme. The scheme deals optimally with time dependent pixelization, non-Hamiltonian dynamics, sub-grid processes, as well as with the accumulation of discretization errors. The price of this generality is a higher complexity of the detailed formulae compared to many ad-hoc schemes. These formulae have to be analyzed case by case to identify the optimal numerical implementation strategy. In order to show this in a simple example, we turn again to the KG field.

► Assuming that we have all freedom to choose $R'$, $N'$, and $\Phi'$ to match $D'^{-1} = \Phi'^{-1} + R'^\dagger N'^{-1}R'$ exactly with

$$D^{*-1} = \Phi^{-1} + R^\dagger N^{-1}R - \delta t\,(R^\dagger N^{-1}R\,S\,E - E\,S\,R^\dagger N^{-1}R) \quad (57)$$

as derived in Sect. III D, we would immediately use $\Phi' = \Phi$ and try to accommodate the change in variance in a changed response or noise. Thus the unchanged signal covariance also results from the data update via the MEP. The considerations to update the prior in Sect. III E were therefore superfluous in this case. The updated prior mean $\psi'$ could also be derived by maximizing the entropy with respect to it. It is not surprising that it turns out to be $\psi' = \psi = 0$. Writing $R' = R + \delta R$ and $N' = N + \delta N$ we find

$$D'^{-1} = \Phi^{-1} + R^\dagger N^{-1}R + \delta R^\dagger N^{-1}R + R^\dagger N^{-1}\delta R - R^\dagger N^{-1}\,\delta N\,N^{-1}R. \quad (58)$$

Comparing the terms of the last two equations, we conclude that the best match is found by the identification

$$\delta R = -\delta t\,R\,S\,E, \qquad \delta N = 0. \quad (59)$$



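Thus, the noise should stay unchanged and can be assumed to be zero for all times if it was zero initially, which we will assume in the following. The response of an optimal scheme should however evolve according to $\partial_t R_t = -R_t\,S\,E$. This can actually be solved analytically, providing

$$R_t = R\,e^{-t\,S\,E} = R\,T_{-t} \quad (60)$$

with the time translation operator

$$(T_t)_{kq} = (e^{t\,S\,E})_{kq} = \mathbb{1}_{kq}\left[\cos(\omega_k t)\begin{pmatrix}1 & 0\\ 0 & 1\end{pmatrix} + \sin(\omega_k t)\begin{pmatrix}0 & \omega_k^{-1}\\ -\omega_k & 0\end{pmatrix}\right], \qquad \omega_k = (k^2+\mu^2)^{1/2}. \quad (61)$$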
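For one Fourier mode the time translation operator is a real 2x2 rotation-like matrix on $(\varphi_k, \pi_k)$, so its basic properties are quick to verify. The following sketch, with illustrative function names, checks the group property $T_t T_s = T_{t+s}$, the unit determinant expected for symplectic evolution, and $\partial_t T_t = S\,E\,T_t$ by a finite difference.

```python
import math

def time_translation(t, k, mu):
    """Single-mode block of T_t = exp(t S E) acting on (phi_k, pi_k)."""
    w = math.sqrt(k * k + mu * mu)
    c, s = math.cos(w * t), math.sin(w * t)
    return [[c, s / w], [-w * s, c]]

def matmul(a, b):
    """Product of two 2x2 matrices given as nested lists."""
    return [[sum(a[i][m] * b[m][j] for m in range(2)) for j in range(2)] for i in range(2)]

def max_abs_diff(a, b):
    return max(abs(a[i][j] - b[i][j]) for i in range(2) for j in range(2))

if __name__ == "__main__":
    k, mu = 3.0, 1.0
    t1, t2 = 0.4, 1.1
    composed = matmul(time_translation(t1, k, mu), time_translation(t2, k, mu))
    print(max_abs_diff(composed, time_translation(t1 + t2, k, mu)))
    t = time_translation(t1, k, mu)
    print(t[0][0] * t[1][1] - t[0][1] * t[1][0])  # determinant
```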



In case we insist on using the original response $R$ for all later times, the change in the uncertainty variance $D^*$ would have needed to be captured by either $\Phi'$ or by $N'$. Neither is optimal for this, which is why the resulting schemes would lose information in the course of the simulation. As we will see in Sect. III G, our scheme with evolving response is lossless with respect to information.

For the data update from $d = d_t$ to $d' = d_{t'}$ at $t' = t + \delta t$ we need only to expand (55) to first order in $\delta t$. In our ideal case with $N \rightarrow 0$ we have $W' = \Phi R_{t'}^\dagger\,(R_{t'}\Phi R_{t'}^\dagger)^{-1} = \Phi R_{t'}^\dagger\,(R\,\Phi\,R^\dagger)^{-1} = \Phi R_{t'}^\dagger\,\tilde\Phi^{-1} = W - \Phi\,(R_t - R_{t'})^\dagger\,\tilde\Phi^{-1} = W - \delta t\,\Phi L^\dagger R_t^\dagger\,\tilde\Phi^{-1}$, as a short calculation verifies. The data evolution is then

$$\begin{aligned}d' &= (W'^\dagger D^{*-1}W')^{-1}\,W'^\dagger D^{*-1}\,m^*\\ &= (W'^\dagger D^{*-1}W')^{-1}\,W'^\dagger D^{*-1}\,(1 + \delta t\,L)\,W d\\ &= (W'^\dagger D^{*-1}W')^{-1}\left[W'^\dagger D^{*-1}W' + W'^\dagger D^{*-1}\,(W - W' + \delta t\,L\,W)\right] d\\ &= d + (W'^\dagger D^{*-1}W')^{-1}\,W'^\dagger D^{*-1}\,\delta t\,\underbrace{(\Phi L^\dagger + L\,\Phi)}_{=\,0}\,R_t^\dagger\,\tilde\Phi^{-1}\,d\\ &= d, \quad (62)\end{aligned}$$

since $\Phi L^\dagger = \beta^{-1}E^{-1}E\,S^\dagger = -S\,E\,\beta^{-1}E^{-1} = -L\,\Phi$. Thus $\partial_t d_t = 0$, the data should not be changed, and the evolution is completely captured by the response evolution. This scheme is optimal from an IFT point of view as we will see in the following. Note that the scheme is completely specified in the data space, since (62) does not require any sub-grid calculations as it does not require any calculations at all. It will be shown in the next section that the evolution of binned field values is also completely specified in data space and that the sub-grid field configuration predictions require the usage of a finer grid only at the very end of the calculation.

This simple data (non-)evolution equation $\partial_t d_t = 0$ is a consequence of our KG example having a linear symplectic evolution, as determined by $\mathcal{H}(\phi) = \frac{1}{2}\phi^\dagger E\,\phi$, and a thermal prior distribution, as characterized by $H(\phi|\beta) = \beta\,\mathcal{H}(\phi)$, both depending on the same energy matrix $E$. In general, $\partial_t d_t \neq 0$ can be expected as soon as the prior and dynamics are more orthogonal in their eigenvector sets. ◄



G. Information field theoretical solution 

► The KG problem is exactly solvable and the later time field can be obtained from applying a time translation operator, as given by (61), to an earlier time field. This operator depends only on the time difference, $\phi_{t'} = T_{t'-t}\,\phi_t$, and is even invertible, so that the earlier field can be calculated from the later one. With this, the time invariance of the field covariance can easily be verified,

$$\Phi_t = \langle\phi_t\,\phi_t^\dagger\rangle_{(\phi)} = T_t\,\Phi\,T_t^\dagger = \Phi, \quad (63)$$

where the last identity requires a few lines of straightforward matrix multiplications using (30) and (61).

Since we want to infer the future field $\phi_t$ from the initial data $d = d_{t=0}$, we have to specify how the initial data depends on the future field. This backward-in-time response is simply given by

$$d = R\,\phi_0 = R\,T_{-t}\,\phi_t = R_t\,\phi_t. \quad (64)$$

Since we now have the response of the initial data $d = d_0$ to the field $\phi_t$ as well as its variance $\Phi_t$ at a later time, we can simply write down the Wiener filter mean field at time $t$, that is

$$m_t = W_t\,d = \Phi_t R_t^\dagger\,(R_t\,\Phi_t\,R_t^\dagger)^{-1}\,d = \Phi\,R_t^\dagger\,\tilde\Phi^{-1}\,d. \quad (65)$$

Here we used the identity $R_t\,\Phi\,R_t^\dagger = R\,T_{-t}\,\Phi\,T_{-t}^\dagger\,R^\dagger = R\,\Phi\,R^\dagger = \tilde\Phi$ that follows from (63). Therefore, any future mean field can be calculated directly from the original data, which therefore does not need to be evolved in time. The response $R_t$ and Wiener filter $W_t$ operators connecting the field at time $t$ to the static data $d = d_{t=0}$ are exactly the ones which were found for the ideal IFD scheme. Thus, IFD reproduces IFT if the parameters of the future instances are able to capture all details of the evolved PDF (see footnote 9). The sub-grid representation of the evolved field as given by (65) only requires complex operations in data space, since $\tilde\Phi$ is fully specified there. Solely the back-projection into continuous signal space by $R_t^\dagger$ and the subsequent spectral weighting by $\Phi$ require sub-grid operations.

One might therefore ask how the virtual data $d_t = R\,\phi_t$ of the original response $R$ applied to later field configurations would evolve and if this requires a sub-grid field resolution. This is of importance to us, since we want to compare the IFD/IFT scheme with ad-hoc schemes, which do not need to have a notion of a sub-grid structure. Since the future field is not precisely known, the correct data at later times cannot be specified. The best we can do is to calculate the a posteriori expectation value of this hypothetical future data. This ideal data at later time, $\tilde d_t = \langle d_t\rangle = \langle R\,\phi_t\rangle_{(\phi_t|d)}$, is therefore

$$\tilde d_t = R\,\Phi\,R_t^\dagger\,(R_t\,\Phi\,R_t^\dagger)^{-1}\,d = \tilde T_t\,d. \quad (66)$$



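Note that the time translation operator of the data $\tilde T_t$ is not unity in general, basically it is only $\tilde T_t = \mathbb{1}$ for $t = 0$, since one of the response operators contains a time translation of the field:

$$(\tilde T_t)_{kq} = (R\,\Phi\,R_t^\dagger\,\tilde\Phi^{-1})_{kq}, \qquad (R\,\Phi\,R_t^\dagger)_{kq} = \frac{2\pi}{\beta}\,\delta_{kq}\sum_{k'\in k+\mathcal{N}\mathbb{Z}}\frac{2\,(1-\cos(k'\Delta))}{k'^2\Delta^2}\begin{pmatrix}\omega_{k'}^{-2}\cos(\omega_{k'}t) & \omega_{k'}^{-1}\sin(\omega_{k'}t)\\ -\omega_{k'}^{-1}\sin(\omega_{k'}t) & \cos(\omega_{k'}t)\end{pmatrix}. \quad (67)$$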


9 The observation that an entropic matching approximation enforced in any instance of continuous time can result in the exact equation for a dynamical system was made previously in an attempt to reconstruct quantum mechanics from statistics [12].



Since this time evolution operator is fully deter- 
mined in data space, and the sub-grid mode dynamics 
is just captured by a sum in a pre-factor to the com- 
putational expensive operator ^^q, we can conclude 
that a data space only scheme was derived. The time evolving data $\tilde d_t = \langle d_t\rangle$ contains the same information as $d$, since the latter can be reconstructed from the former via $d = \tilde T_t^{-1}\,\tilde d_t$. We can derive an evolution equation for $\tilde d_t$ by simply taking the temporal derivative of (66):

$$\partial_t\tilde d_t = (\partial_t\tilde T_t)\,d = (\partial_t\tilde T_t)\,\tilde T_t^{-1}\,\tilde d_t.$$

It is obvious that this ideal evolution equation of the virtual data according to the original response $R$ is not only more complicated than just having an evolving response $R_t$ and stationary data, it is also a differential equation with time dependent coefficients. This might be surprising, since the dynamical equation of the underlying KG field is invariant under time translation. However, this time-translational symmetry is broken for our knowledge state on the field, for which the time $t = 0$ of the initial data set $d = R\,\phi_{t=0}$ is clearly singled out. The different Fourier data modes are mixtures of different field modes, which evolve with individual frequencies. Thus, the recovery of a similar mixture $d_k = (R\,\phi_t)_k = \sum_{j\in\mathbb{Z}} e^{-\frac{1}{2}i(k+\mathcal{N}j)\Delta}\,\mathrm{sinc}\!\left(\tfrac{1}{2}(k+\mathcal{N}j)\Delta\right)(T_t\,\phi)_{k+\mathcal{N}j}$, with the original phases in the response, works differently at different times, due to the changed phases of the individual modes. Therefore, the optimal IFD differential equation for data according to the original response becomes time dependent. Nevertheless, we would like to have something like a (now time dependent) data mode frequency for a comparison with ad-hoc simulation schemes. An observer of the data dynamics could estimate such a frequency in a pragmatic way by using


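$\partial_t^2\tilde d_k + \tilde\omega_{k,t}^2\,\tilde d_k = 0$ as an analog of $\partial_t^2\varphi_k + \omega_k^2\,\varphi_k = 0$ to define

$$\tilde\omega_{k,t}^2 = -\frac{\partial_t^2\,\tilde T^{(\varphi)}_{k,t}}{\tilde T^{(\varphi)}_{k,t}}, \quad (68)$$

where $\tilde T^{(\varphi)}_{k,t}$ denotes the field-value diagonal element of (67). The resulting frequencies are best calculated numerically, since the involved formula (67) contains an infinite sum without a known closed form. For $t = 0$, however, a closed form can be derived,

$$\begin{aligned}\tilde\omega_{k,t=0}^2 &= \mu^2\left[1 - \frac{2\,\sinh(\mu\Delta)\,\sin^2(\tfrac{1}{2}k\Delta)}{\Delta\mu\,\left(\cosh(\mu\Delta)-\cos(k\Delta)\right)}\right]^{-1} \quad (69)\\ &= (k^2+\mu^2)\left(1 + \frac{k^2\Delta^2}{12} + O(\Delta^4)\right),\end{aligned}$$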



that recovers the original continuous space KG frequency $\omega_k = (k^2+\mu^2)^{1/2}$ in the limit $\Delta \rightarrow 0$, but differs from it for finite grid spacings. The oscillation frequency of a data mode is slightly higher than the directly corresponding continuous field mode, since the former also contains field modes from larger $k$, which have larger frequencies, due to the mode mixing of the response operator. The advanced revolution of the field modes at early times will be compensated later on by a reduced oscillation speed. The initial and later time data dispersion relation is shown in Fig. 2 together with those of ad-hoc schemes derived in the next section. ◄
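The initial data-mode dispersion relation can be tabulated next to the continuum one. The sketch below assumes the closed form $\tilde\omega^2_{k,0} = \mu^2\,[1 - 2\sinh(\mu\Delta)\sin^2(\frac{1}{2}k\Delta)/(\mu\Delta(\cosh(\mu\Delta)-\cos(k\Delta)))]^{-1}$, i.e. the inverse of the field-value data covariance, and compares it with the leading-order expansion; the function names are illustrative.

```python
import math

def omega_tilde_sq_initial(k, n, mu):
    """Squared IFD data-mode frequency at t = 0 for data mode k on n pixels."""
    delta = 2.0 * math.pi / n
    ratio = (
        2.0 * math.sinh(mu * delta) * math.sin(0.5 * k * delta) ** 2
        / (mu * delta * (math.cosh(mu * delta) - math.cos(k * delta)))
    )
    return mu * mu / (1.0 - ratio)

def omega_sq_continuum(k, mu):
    """Continuous-space KG dispersion relation omega_k^2 = k^2 + mu^2."""
    return k * k + mu * mu

def omega_tilde_sq_expansion(k, n, mu):
    """Leading-order expansion (k^2 + mu^2) (1 + k^2 Delta^2 / 12)."""
    delta = 2.0 * math.pi / n
    return (k * k + mu * mu) * (1.0 + k * k * delta * delta / 12.0)

if __name__ == "__main__":
    for k in (0, 1, 3, 10, 32):
        print(k, omega_sq_continuum(k, 1.0), omega_tilde_sq_initial(k, 64, 1.0),
              omega_tilde_sq_expansion(k, 64, 1.0))
```

For small $k\Delta$ the closed form and the expansion agree closely, while towards the Nyquist mode the data-mode frequency lies visibly above the continuum value.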
Figure 2. (Color online) (a): Fourier-data space dispersion relations $\omega_k$ of numerical schemes for the KG field simulation for the parameters $\mathcal{N} = 64$ and $\mu = 1$. The IFD scheme data mode frequencies $\tilde\omega_{k,t}$ are shown at initial time $t = 0$ as given by (69) (top, blue dots), an instance later at $t = 10^{-4}$ (top, blue solid line with kinks), and at time $t = \pi/2$ (strongly oscillating blue dotted line). At $t = \pi$, the IFD scheme dispersion relation looks similar to the initial one. The spectral scheme frequencies $\omega_k^{\mathrm{spec}}$ as given by (71) (middle, black squares) follow the continuous space field-dispersion (thin, smooth, and black line). Finally, the finite difference scheme $\omega_k^{\mathrm{diff}}$ as given by (70) has the lowest frequencies (bottom, brick red triangles). (b): Data space representation of the numerical scheme operator $L_{ij}$ as a function of the pixel number difference $i - j$ for small differences. The curves are given by the discrete Fourier transformations of $\omega_k^2$ for the IFD scheme at $t = 0$ (most extreme, blue dots and line) as well as for $t = \pi/2$ (smaller light blue dots and blue dotted line close to intermediate black line), the spectral scheme (intermediate values, black squares and line), and for the finite difference scheme (most moderate values, brick red triangles and line). It should be noted that the IFD operator at $t = \pi/2$ also contains some power around positions $i - j = \pm\mathcal{N}/2 = \pm 32$ (not shown in this figure) as a consequence of the heavy oscillations of $\tilde\omega_{k,t}$ at this time that are visible in panel (a).



H. Summary of the derivation 

► A brief summary of the essential steps of the IFD 
recipe applied to the KG problem might be instruc- 
tive: 

1. Field dynamics: The KG equation was converted into a differential equation of first order in time, $\partial_t\phi = L\,\phi$, by the introduction of the momentum field $\pi_x = \dot\varphi_x$ as a second component of a two component field $\phi = (\varphi, \pi)$. The KG equation is linear, as $L$ is independent of $\phi$. This simplified the derivation of an IFD scheme. If a nonlinear equation should be simulated, the equation has to be linearized around the current mean field at any simulation time step.

2. Prior knowledge: The a priori KG field statistics was specified as a thermal distribution $\mathcal{P}(\phi) \propto \exp(-\mathcal{H}(\phi)/T)$. The fact that in this case the KG Hamiltonian $\mathcal{H}(\phi)$ determines both the dynamical operator $L$ as well as the a priori statistics $\mathcal{P}(\phi)$ turns out to simplify the resulting scheme considerably. It is, however, not a general necessity for the applicability of IFD. The a priori distribution is a Gaussian since the Hamiltonian is quadratic in $\phi$. If non-Gaussian priors are to be used it is recommended to find a Gaussian approximation since IFD is developed so far only for Gaussian priors.

3. Data constraints: As a next step, the computer data space was introduced. The computer data $d$ needs to be related to the field $\phi$ and this relation should be linear for practical reasons and could be assumed to be noiseless for the KG example, $d = R\,\phi$. The initial discretization operator $R$ was chosen here to perform a simple bin average. Therefore the average field value in each bin is known if the data is available, but not the detailed field configuration within those. However, not all possible sub-grid field configurations are equally plausible, since the prior gives them different weights. Combining prior and data information, the ensemble of plausible field configurations can be specified, and characterized by its mean field $m = \langle\phi\rangle_{(\phi|d)}$ and uncertainty variance $D = \langle(\phi - m)\,(\phi - m)^\dagger\rangle_{(\phi|d)}$ determining a Gaussian a posteriori distribution $\mathcal{P}(\phi|d) = \mathcal{G}(\phi - m, D)$. This is a Gaussian thanks to the Gaussian prior and linear data model. The mean field and its variance are auxiliary mathematical objects used in the derivation of the simulation scheme that need no concrete representations in computer memory.

4. Field evolution: The action of the time evolution operator on the posterior distribution had then to be worked out analytically. Since we insisted on linear or linearized operators, the time evolved posterior is again a Gaussian, $\mathcal{P}(\phi'|d) = \mathcal{G}(\phi' - m^*, D^*)$, characterized by an updated mean $m^*$ and uncertainty variance $D^*$, both again auxiliary mathematical objects.

5. Prior update: The prior of the later time might be different and should be updated since it will be used again. However, due to energy and phase space conservation of the KG dynamics, the KG prior is unchanged. This step could have been skipped, since the evolution of the prior can also be determined as part of the next step, the data update via entropic matching. However, this requires that the field dynamical equation captures all sub-grid physics. If this is not the case, the prior update step might permit the implementation of sub-grid processes that are not present in the dynamical equation.

6. Data update: Finally, an update formula for the
later time data $d'$ in computer memory was
constructed. This was done by first specifying the
mathematical relationship between any such data and the
later time field a posteriori distribution,
$\mathcal{P}(\phi'|d') = \mathcal{G}(\phi' - m', D')$,
where $m' = D' R'^\dagger d'$ and $R'$ and $D'$ are the
response and propagator/variance at the later time.
Then the time evolved distribution
$\mathcal{P}(\phi'|d)$ and the one determined by the new
data, $\mathcal{P}(\phi'|d')$, were matched
entropically. The parameters used to get an optimal
match can be any of the later time, primed quantities.
In the particular KG example it turned out to be most
effective to vary $d'$ and $R'$ in the entropic
matching, since this way an information-lossless scheme
could be obtained. This scheme maps the entire field
evolution onto an evolving response operator $R_t$ and
stationary data. We showed that the resulting
simulation scheme is indeed optimal by comparison to
the exact, information theoretically derived solution
of the future field prediction problem. Since this
particular KG simulation scheme does not modify the
data, we asked how the binned field values (with
stationary bin averaging) would evolve and derived
their evolution equation. The corresponding time
translation operator also does not require any explicit
sub-grid field representation, but encodes sub-grid
physics implicitly.

The derived simulation scheme can now be imple- 
mented on a computer. The resulting code performs 
only data space operations and does not require any 
sub-grid representation. The sub-grid physics, the 
prior knowledge, and the details of the measurement 
process (the data to fields relation) have all been in- 
cluded in the IFD scheme. 
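Steps 2 and 3 of this summary can be sketched numerically. The following is a minimal illustration, not the scheme derived in this paper: it assumes a single real scalar field on a periodic pixel grid, a stationary prior covariance with power spectrum $1/(\beta(\mu^2 + k^2))$ (normalization chosen for illustration), and the standard formula for conditioning a Gaussian on exact linear data. The names `bin_average_response`, `prior_covariance`, and `noiseless_posterior` are hypothetical.

```python
import numpy as np

def bin_average_response(n_pix, n_bins):
    """Return R (n_bins x n_pix): each datum is the mean of one bin of pixels."""
    width = n_pix // n_bins
    R = np.zeros((n_bins, n_pix))
    for i in range(n_bins):
        R[i, i * width:(i + 1) * width] = 1.0 / width
    return R

def prior_covariance(n_pix, mu=1.0, beta=1.0):
    """Periodic stationary covariance with power 1/(beta*(mu^2+k^2)) (assumed normalization)."""
    k = np.fft.fftfreq(n_pix, d=1.0 / n_pix)          # integer wave numbers
    power = 1.0 / (beta * (mu**2 + k**2))
    corr = np.fft.ifft(power).real                    # correlation function C(x - y)
    idx = (np.arange(n_pix)[:, None] - np.arange(n_pix)[None, :]) % n_pix
    return corr[idx]

def noiseless_posterior(Phi, R, d):
    """Condition a zero-mean Gaussian field with covariance Phi on exact data d = R phi."""
    W = Phi @ R.T @ np.linalg.inv(R @ Phi @ R.T)      # noiseless limit of the Wiener filter
    m = W @ d                                         # posterior mean field
    D = Phi - W @ R @ Phi                             # posterior uncertainty covariance
    return m, D

n_pix, n_bins = 128, 8
R = bin_average_response(n_pix, n_bins)
Phi = prior_covariance(n_pix)
rng = np.random.default_rng(0)
phi = rng.multivariate_normal(np.zeros(n_pix), Phi)   # one prior field realization
d = R @ phi                                           # bin-averaged, noiseless data
m, D = noiseless_posterior(Phi, R, d)
```

In this sketch the posterior mean reproduces the data exactly, and the remaining uncertainty lives entirely in directions that the bin averages do not constrain.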



IV. NUMERICAL VERIFICATION 
A. Standard simulation schemes 

The IFD scheme for the KG field should now be
compared to more standard simulation schemes for
the KG equation as described in Appendix A 1.

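The most common one is the finite difference
discretization of the differential operators by setting
$\partial_x \varphi_x \approx (\varphi_{(i+1)\Delta} - \varphi_{i\Delta})/\Delta$
and
$\partial_x^2 \varphi_x \approx (\varphi_{(i+1)\Delta} - 2\varphi_{i\Delta} + \varphi_{(i-1)\Delta})/\Delta^2$
at $x = i\Delta$. The KG equation discretized in this way,
$\partial_t d = L^{\mathrm{diff}} d$, in which the operator
$\partial_x^2 - \mu^2$ is replaced by the stencil

$$\Delta^{-2}\,\delta_{i[j+1]} - (2\Delta^{-2} + \mu^2)\,\delta_{ij} + \Delta^{-2}\,\delta_{i[j-1]}, \qquad [j] = j \bmod N,$$

becomes diagonal in Fourier space, just with the
dispersion relation given by

$$(\omega_k^{\mathrm{diff}})^2 = \mu^2 + 2\Delta^{-2}\,(1 - \cos(k\Delta)). \tag{70}$$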
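The finite difference dispersion relation can be checked directly. The sketch below builds the periodic stencil for $\partial_x^2 - \mu^2$ as a matrix and applies it to plane waves; it assumes a periodic domain of length $2\pi$ with integer wave numbers, and the function name `fd_operator` is hypothetical.

```python
import numpy as np

def fd_operator(n, mu=1.0, length=2 * np.pi):
    """Periodic second-difference stencil minus mu^2 (discretized d^2/dx^2 - mu^2)."""
    delta = length / n
    L = np.zeros((n, n))
    for i in range(n):
        L[i, (i + 1) % n] += delta**-2
        L[i, (i - 1) % n] += delta**-2
        L[i, i] -= 2 * delta**-2 + mu**2
    return L, delta

n, mu = 64, 1.0
L, delta = fd_operator(n, mu)
k = np.arange(n)                                   # integer wave numbers
# Dispersion relation of the difference scheme.
omega2_formula = mu**2 + 2 * delta**-2 * (1 - np.cos(k * delta))

# Apply the stencil to each plane wave exp(i k x_j) and read off the eigenvalue -omega^2.
x = delta * np.arange(n)
waves = np.exp(1j * np.outer(x, k))                # column k holds the k-th plane wave
omega2_numeric = -np.real((L @ waves)[0, :] / waves[0, :])
```

Every plane wave is an eigenvector of the circulant stencil, which is why the discretized equation becomes diagonal in Fourier space.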



This and the IFD dispersion relation are shown in
Fig. 5 in comparison to the one of the original KG
field, $\omega_k^2 = \mu^2 + k^2$. Since the initial IFD
frequencies lie above, and the frequencies of the
difference scheme lie below those of the KG field, it is
also natural to consider the KG dispersion relation
itself as another option. Thus we also investigate a
spectral simulation scheme with$^{10}$

$$(\omega_k^{\mathrm{spec}})^2 = \begin{cases} \mu^2 + k^2 & \text{for } k \in \{0, \ldots, N/2\} \\ \mu^2 + (N-k)^2 & \text{for } k \in \{N/2, \ldots, N\} \end{cases} \tag{71}$$
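The two ad hoc dispersion relations can be tabulated side by side. This is a small sketch under the same assumptions as the text (integer wave numbers $k \in \{0, \ldots, N-1\}$, grid spacing $\Delta = 2\pi/N$); the function names are hypothetical, and the IFD dispersion relation is not reproduced here.

```python
import numpy as np

def omega2_spectral(n, mu=1.0):
    """Exact KG relation evaluated for the wave number folded into the first Brillouin zone."""
    k = np.arange(n)
    k_folded = np.where(k <= n // 2, k, n - k)     # second half represents negative wave numbers
    return mu**2 + k_folded**2

def omega2_difference(n, mu=1.0):
    """Dispersion relation of the finite difference scheme on the same grid."""
    k = np.arange(n)
    delta = 2 * np.pi / n
    return mu**2 + 2 * delta**-2 * (1 - np.cos(k * delta))

n = 64
spec = omega2_spectral(n)
diff = omega2_difference(n)
max_gap = np.max(spec - diff)                      # largest near the Nyquist wave number
```

Because $1 - \cos(k\Delta) \le (k\Delta)^2/2$, the difference scheme frequencies never exceed the spectral ones, and the gap grows towards the Nyquist wave number.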

The Fourier space data evolution equation can be
solved analytically. Its solution, Eq. (72), superposes
for each data mode $k$ the two oscillations
$e^{i\omega_k t}$ and $e^{-i\omega_k t}$ with the amplitudes
$a_k$ and $a_{N-k}$, and these coefficients are
determined by the initial data at $t = 0$ via Eq. (73).



Thus, the most efficient way to run these KG
simulation schemes is to evolve the initial data
analytically according to these Fourier space equations
and to transform the field back to position space at
the desired time.
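A minimal sketch of such an analytic Fourier space evolution is given below. It is written for a real field $\varphi$ and its time derivative $\pi = \partial_t \varphi$ rather than in the amplitude notation of Eqs. (72) and (73), which is an assumption made here for simplicity; each mode is treated as a harmonic oscillator with frequency $\omega_k$. The function names are hypothetical.

```python
import numpy as np

def evolve_kg(phi0, pi0, t, omega2):
    """Evolve field phi and momentum pi = d(phi)/dt exactly, mode by mode (omega2 > 0)."""
    omega = np.sqrt(omega2)
    phi_k, pi_k = np.fft.fft(phi0), np.fft.fft(pi0)
    c, s = np.cos(omega * t), np.sin(omega * t)
    phi_kt = phi_k * c + pi_k * s / omega          # harmonic oscillator solution per mode
    pi_kt = -phi_k * omega * s + pi_k * c
    return np.fft.ifft(phi_kt).real, np.fft.ifft(pi_kt).real

def energy(phi, pi, omega2):
    """Total mode energy, conserved by the exact evolution."""
    phi_k, pi_k = np.fft.fft(phi), np.fft.fft(pi)
    return 0.5 * np.sum(np.abs(pi_k)**2 + omega2 * np.abs(phi_k)**2) / len(phi)

n, mu = 64, 1.0
k = np.fft.fftfreq(n, d=1.0 / n)                   # integer wave numbers, both signs
omega2 = mu**2 + k**2                              # spectral scheme; swap in another relation if desired
rng = np.random.default_rng(1)
phi0, pi0 = rng.normal(size=n), rng.normal(size=n)
phi_t, pi_t = evolve_kg(phi0, pi0, 10.0, omega2)
```

Since the solution is exact for any $t$, no intermediate time steps are needed; replacing `omega2` by the finite difference relation gives the corresponding difference scheme result.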

The ad hoc simulation schemes are best implemented
via (72) and (73), the corresponding data $d$ of the IFD
scheme according to (66) and (67), whereas the full
field including the sub-grid modes can be followed via
(25).



B. Time evolution 

To see how well the different simulation schemes
perform, we simulate a KG field by setting up its
Fourier amplitudes $a_k \in \mathbb{C}$ up to
$|k| = N_\phi/2$, drawn from
$\mathcal{P}(a_k) = \mathcal{G}(a_k, 1/(4\beta(\mu^2 + k^2)))$,
and $a_{-k} = \bar{a}_k$ for the "negative" modes, so
that (26), (28), and $\phi_x \in \mathbb{R}^2$ are
satisfied. We use $N_\phi = 2048$, $\mu = 1$, and
$\beta = 1$. A resulting field realization is displayed
in Fig. 1. We time-evolve all its Fourier modes
according to (25). The initial and late time exact data
is generated via $d_t = R\phi_t$ with the response given
by (32) for $N = 64$ data bins. This means that there
are $N_\phi/N = 32$ independent field modes combined in
a single datum, ensuring that there is substantial
sub-grid uncertainty, as is well observable in Fig. 1.
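The set-up can be mimicked with a few lines of code. The sketch below is only illustrative: it draws a single real scalar field on 2048 pixels by filtering white noise with the spectrum $1/(\beta(\mu^2 + k^2))$ (the overall normalization and the one-component field are assumptions, not the conventions of Eqs. (25) to (32)) and then bin-averages it into 64 data values.

```python
import numpy as np

n_phi, n_data, mu, beta = 2048, 64, 1.0, 1.0
rng = np.random.default_rng(42)

# Draw a real field with power spectrum 1/(beta*(mu^2+k^2)) by filtering white noise.
k = np.fft.fftfreq(n_phi, d=1.0 / n_phi)           # integer wave numbers
amplitude = 1.0 / np.sqrt(beta * (mu**2 + k**2))
white = rng.normal(size=n_phi)
phi = np.fft.ifft(amplitude * np.fft.fft(white)).real

# Bin averaging: each datum is the mean over n_phi / n_data = 32 neighbouring field values.
width = n_phi // n_data
d = phi.reshape(n_data, width).mean(axis=1)

# Sub-grid scatter of the field around its bin averages.
subgrid_rms = np.sqrt(np.mean((phi - np.repeat(d, width))**2))
```

The quantity `subgrid_rms` gives a rough measure of how much field structure is hidden inside the bins and therefore not fixed by the data.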



10 The distinction between the cases is only necessary here, since we
use $k \in \{0, \ldots, N-1\}$ so that the negative frequencies are
represented by wave numbers in the second half of the range.
If we used $k \in \{-N/2+1, \ldots, N/2\}$ as our first Brillouin
zone, we would have $(\omega_k^{\mathrm{spec}})^2 = \mu^2 + k^2$.
Figure 3. (Color online) (a): Evolved field (thin, black line) and data at $t = 10$ of the field also shown in Fig. 1
($\beta = 1$, $\mu = 1$, $N = 64$). The exact data $d_t = R\phi_t$ are shown as yellow diamonds. The IFD data according to (66)
and (67) (blue dots) follows the exact data closely. The data of the spectral scheme (black squares) is very close to the
IFD data. The data of the difference scheme (brick red triangles) exhibit the poorest match to the correct data of the
evolved field. The root mean square errors of the field data values, $\sigma_d = \sqrt{\sum_{i=0}^{N-1} (d_i - (R\phi)_i)^2 / N}$, of the three schemes
are 0.003, 0.004, and 0.020 for the IFD, spectral, and difference scheme, respectively. (b): Temporal evolution of the
data error $\sigma_d(t)$ for the IFD (bottom solid blue line), spectral (dashed black line slightly above the former), and finite
difference (top brick red line) scheme. The dip in the IFD and spectral scheme error at $t \approx \pi$ is due to the nearly perfect
alignment of the mode phases at this particular time.



For the spectral and difference schemes, the data is
time evolved according to (72) and (73). For the IFD
scheme, we use (66) and (67) to calculate the
corresponding late time data.

For time $t = 10$, the field is shown and the
different data sets at this time are compared in
Fig. 3. This time was chosen because the difference
scheme already exhibits some significant but still
moderate deviations from the correct solution. The IFD
and spectral scheme are both relatively accurate. A
difference between them exists, but is hard to see by
eye in this snapshot. However, a comparison of the
spatially averaged errors of the two schemes reveals a
significantly higher accuracy of the IFD scheme with
respect to the spectral scheme at basically all times.
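The spatially averaged error used for this comparison is a plain root mean square over the data bins. A minimal sketch, with the hypothetical function name `rms_data_error` and made-up input values:

```python
import numpy as np

def rms_data_error(d_scheme, d_exact):
    """Root mean square deviation between simulated data and exact data R phi over all bins."""
    d_scheme = np.asarray(d_scheme, dtype=float)
    d_exact = np.asarray(d_exact, dtype=float)
    return np.sqrt(np.mean((d_scheme - d_exact)**2))

# Illustrative values only, not simulation output.
example_error = rms_data_error([0.0, 0.0], [3.0, 4.0])
```

Evaluating this function at a sequence of times for each scheme yields error curves of the kind compared here.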

Although the IFD scheme has the highest fidelity,
the spectral scheme is also very good for arbitrarily
large times. The reason can easily be understood.
Despite the fact that any data Fourier mode is a mixture
of several field modes, the spectral scheme just follows
the most dominant of these modes, and treats the others
as random noise. However, since the main mode is
correctly captured, it can be followed over arbitrarily
long time intervals, and the ignored modes just
contribute a fixed amount of uncertainty. The IFD scheme
also assigns some power to these higher modes and
follows their evolution. This is why it has a higher
accuracy.
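The mixing of several field modes into one data Fourier mode can be made explicit for bin averaging. The sketch below assumes a pixelized field with 2048 values and 64 bins and uses the hypothetical helper `data_spectrum_of_mode`; it shows that the field modes $k$ and $k + N$ land on the same data mode, the latter strongly attenuated.

```python
import numpy as np

n_phi, n_data = 2048, 64
width = n_phi // n_data
x = np.arange(n_phi)

def data_spectrum_of_mode(k):
    """Bin-average the field mode exp(2*pi*i*k*x/n_phi); return the data Fourier amplitudes."""
    mode = np.exp(2j * np.pi * k * x / n_phi)
    d = mode.reshape(n_data, width).mean(axis=1)
    return np.fft.fft(d) / n_data

low = data_spectrum_of_mode(3)               # resolved field mode
high = data_spectrum_of_mode(3 + n_data)     # sub-grid mode aliased onto the same data mode
attenuation = np.abs(high[3]) / np.abs(low[3])
```

A scheme that tracks only the dominant contribution to each data mode ignores the aliased part, which then acts like a fixed noise floor.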

Optimally, one would have chosen an initial response
that maps the first $N$ Fourier modes of the field
exactly into the data. Then these modes could have been
followed with absolute precision, while one would have
no information on the lower amplitude higher Fourier
modes. In this case the IFD scheme would have been
identical to the spectral scheme, but it would not have
served us well as a sufficiently complex example
illustrating the inner workings of the IFD framework.



V. CONCLUSIONS AND OUTLOOK 

Information field dynamics serves as a framework to 
derive numerical simulation schemes. It rests on infor- 
mation field theory in order to construct continuous 
space field configurations out of the finite data in com- 
puter memory. It uses the maximum entropy principle 
to construct updated computer memory data so that 
the ensemble of time-evolved continuous space field 
configurations is matched by the ensemble implied by 
the updated data with minimal information loss. 

The data updating operations of an IFD simulation
time step, as given by (55) and (56), are in general
complex, and might require the usage of linear algebra
solvers. However, for numerical stability reasons, an
implicit time step scheme might be adopted for a
simulation anyway, and the linear algebra operations of
the implicit and IFD schemes might be performed
together.

As an illustrative example, we have derived the opti- 
mal IFD scheme for a thermally excited Klein-Gordon 
field. It could be shown that the resulting IFD scheme 
is identical to the one resulting from IFT. The scheme 
is much more accurate than a simplistic real space dis- 
cretization of the differential operator, and it is still 
significantly more accurate than a spectral scheme. In 
comparison to these two ad hoc schemes with station- 
ary evolution equations for the data, the IFD scheme 
exhibits a time dependent discretization of the differ- 
ential equation. This is due to its ability to follow to 
some level the evolution of the sub-grid scales without 
representing them explicitly in computer memory but
capturing their influence implicitly in the data update 
rules. 

This initial work on IFD should be regarded as a 
proposal for how to incorporate information theoreti- 
cal considerations into the construction of simulation 
schemes. IFD permits us to state and include explic- 
itly background knowledge on sub-grid behavior as 
well as external measurement data in a way that hope- 
fully exploits and conserves as much of the available 
information as possible. 

For technical reasons, one might compromise infor- 
mation theoretical fidelity for reducing the numeri- 
cal complexity. Also for this balance, the information 
theoretical language introduced here should help to 
judge the choices. Finally, the language of IFD is 
already what is needed for data assimilation simula- 
tion schemes, as for example used in weather forecasts. 
The next goal of this research line is to develop IFD 
schemes for scientifically and technologically more rel- 
evant problems, like turbulent hydrodynamics. This, 
however, is left for future work. 
Appendix A: Previous work

1. Discretization of differential operators 

Most of the dynamical systems in physics are de- 
scribed by partial differential equations. These con- 
tain differential operators acting on the dynamical 
fields. With the finite representation of the fields in 
computer memory, these operators need a discretized 
representation as well. A number of discretization 
schemes have been developed, including finite differ- 
ence methods, finite volume methods, finite element 
methods, spectral methods, smoothed particle hydro- 
dynamics and others. Most of these schemes assume 
a distinct sub-grid structure for the fields, in contrast 
to IFD. 

Finite difference methods [14] represent
differentials by finite differences between the field
values at the lattice grid points. These finite
difference operators are exact if the field is
polynomial of the order of the operator. Thus a finite
difference gradient operator implicitly assumes the
field to be piecewise linear on sub-grid scales, a
Laplace operator the field to be quadratic, and so
forth. In Sect. IV we show numerically that the IFD
operator for the KG field evolution is superior to the
finite difference operator.
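The exactness statement is easy to verify. The following sketch, with hypothetical helper names, applies a forward difference gradient to a linear function and a central second difference to a quadratic one, and contrasts this with a non-polynomial function.

```python
import numpy as np

def forward_gradient(f, x, h):
    """First-order forward difference approximation of df/dx."""
    return (f(x + h) - f(x)) / h

def central_laplace(f, x, h):
    """Three-point central difference approximation of d^2f/dx^2."""
    return (f(x + h) - 2 * f(x) + f(x - h)) / h**2

def linear(x):
    return 3.0 * x + 1.0          # gradient is 3 everywhere

def quadratic(x):
    return 2.0 * x**2 - x         # second derivative is 4 everywhere

x0, h = 0.7, 0.25
grad_linear = forward_gradient(linear, x0, h)      # exact up to rounding
lap_quadratic = central_laplace(quadratic, x0, h)  # exact up to rounding
lap_cosine = central_laplace(np.cos, x0, h)        # not a polynomial: error of order h^2
cosine_error = abs(lap_cosine - (-np.cos(x0)))
```

The nonzero `cosine_error` reflects the implicit sub-grid assumption: the stencil is only exact where the field really is a low-order polynomial between grid points.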

Finite volume methods [15] are used when conserved
quantities are simulated, such as the fluid mass in
hydrodynamics. The space is split into pixel volumes.
The continuity equations for the conserved quantities
can be turned into balance equations for the fluxes of
the quantity through the boundaries of a pixel's
volume. The simplest assumption for the sub-grid field
configuration is that it is constant within the pixels,
with jumps at their boundaries. The resulting
discontinuities have to be treated as separate Riemann
problems at the boundaries in hydrodynamics. A
conservative IFD scheme should also be possible, if the
stored data of the scheme are the amounts of the
conserved quantity within pixel volumes, and the fluxes
between adjacent pixels.
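The flux balance idea can be illustrated with the simplest possible case. The sketch below is a first-order upwind finite volume update for linear advection with constant positive velocity on a periodic grid, assuming piecewise constant cell averages; it is a generic textbook example, not a scheme from this paper, and `upwind_step` is a hypothetical name.

```python
import numpy as np

def upwind_step(q, velocity, dx, dt):
    """One conservative update of cell averages q for advection with velocity > 0 (periodic)."""
    flux_right = velocity * q                 # flux leaving each cell through its right face
    flux_left = np.roll(flux_right, 1)        # flux entering through the left face
    return q - dt / dx * (flux_right - flux_left)

n = 50
dx, dt, velocity = 1.0 / n, 0.5 / n, 1.0      # CFL number 0.5
q = np.where(np.abs(np.arange(n) * dx - 0.3) < 0.1, 1.0, 0.0)   # top-hat initial profile
total_before = q.sum() * dx
for _ in range(40):
    q = upwind_step(q, velocity, dx, dt)
total_after = q.sum() * dx
```

Whatever leaves one cell enters its neighbour, so the total amount of the conserved quantity is unchanged up to rounding, although the profile itself is smeared out.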

Finite element methods [16, 17] also partition the
space into sub-volumes, the 'elements'. A set of basis
functions for the field is defined, with a support
covering only a small number of the elements/pixels.
The field is represented as a linear combination of
these basis functions, and therefore with a tightly
parametrized sub-grid structure, e.g. being piecewise
linear. The partial differential equations are only
required to be solved weakly, in the Sobolev function
space spanned by the chosen basis functions. This turns
spatial differential operators into linear systems of
equations, which then can be solved on a computer.

Spectral methods are also Sobolev space based, 
just with the basis functions being Fourier modes. We 
will compare the IFD scheme for the KG field to a 
spectral method and show that IFD provides a slightly 
more accurate simulation. 

Smoothed particle hydrodynamics [18-20] discretizes
the mass of the fluid and not the space.
Smoothed particle hydrodynamics is one example of 
Lagrangian methods, in which the 'grid' follows the 
flow. Each mass element has a dynamically evolving 
position and is thought to be distributed over some 
finite ball according to a radially declining and adap- 
tively sized kernel function determining the sub-grid 
field structure. 

Moving mesh codes can be regarded as a compromise
between Eulerian schemes with fixed lattices and
Lagrangian schemes with a co-moving but particle based
fluid discretization, such as smoothed particle
hydrodynamics [21, 22]. Moving mesh codes were recently
improved by using Voronoi tessellation to create
flexible volume cells around the moving grid points, on
which finite volume methods can be used [35]. Thus also
here the sub-grid field representation is of a
predetermined functional form.

In contrast to these approaches, IFD does not assume
an a priori shape of the sub-grid field structure. It
considers all possible sub-grid configurations
consistent with the constraints given by the data and
the field equations, but weights them with a priori
plausibilities. This requires knowledge on the sub-grid
dynamics.



2. Sub-grid scale modeling 

IFD, as proposed here, requires prior information on
all modes of the dynamical field, in order to constrain
the unresolved degrees of freedom. The necessity to use
information on sub-grid scales in simulations was
already realized for hydrodynamics. For this reason,
the method of large eddy simulations was developed
[24-26]. This resolves the largest scales of a flow by
simulating a spatially filtered (convolved) dynamics,
in combination with sub-grid scale models that try to
summarize the effect of the unresolved scales on the
global dynamics [27-30]. Usually stress tensors describe
the sub-grid scales. These are actually velocity
fluctuation covariance matrices and therefore
conceptually similar to the field uncertainty
covariances in IFD. Large eddy simulations have recently
been combined with adaptive mesh refinement methods that
increase the resolution at locations where small scale
dynamics is particularly important. This is especially
relevant in astrophysical applications, where a large
range of scales should be followed, as for example in
galaxy clusters [31, 32].

In astrophysical hydrodynamics, many additional
processes on unresolved scales, like star formation and
radiative feedback, are relevant yet cannot be followed
in detail. In simulations of galaxies using smoothed
particle hydrodynamics, the interstellar medium is often
described as a mixture of interacting gas phases
(molecular, ionized, ...) forming a complex weather,
with a single effective equation of state summarizing
these phases [33]. However, the translation of sub-grid
physics into a concrete simulation scheme is usually
done ad hoc, without considering the resolution
dependent level of sub-grid fluctuations.

In oceanography, it has been recognized that 
some information about sub-grid eddy evolution is 
contained in the large scale fluid motions due to the 
practical incompressibility of water and the result- 
ing solenoidality of the flow patterns. Partial recon- 
struction of the sub-grid eddies from a coarse res- 
olution is therefore possible [34]. This has been used 
to construct accurate simulation schemes for advec- 
tive tracers and for vorticity transport [35] [35]. A 
maximum entropy production principle was in- 
troduced in this context in order to construct sub-grid 
configurations that are numerically stable [35]. There,
maximum entropy was regarded merely as a numeri- 
cal regularization trick, while in our work, it plays an 
important role in ensuring optimal information flow 
between the simulation data at different time steps. 



3. Data assimilation methods 

Data assimilation methods are probably most sim- 
ilar in spirit to IFD. Data assimilation methods are 
used in weather forecast calculations to impose con- 
straints from past measurements on numerical simu- 
lation of the atmosphere. A recent comparison of such 
methods can be found in [37]. The gold standard of 
the field is the full Bayesian posterior distribution of 
the dynamical system given all data. Typically, there 
are two broad classes of algorithms used to approxi- 
mate this in a computationally affordable way: parti- 
cle ensemble filters and variational methods. 

Particle filters represent the knowledge and
uncertainty on the system state as an ensemble of
realizations, called the particles. These evolve
individually according to the system dynamics to later
times, when new measurements are available. Then, the
particles are selected and/or re-weighted according to
their individual consistency with the new data.
Resampling this distribution with a new set of particles
(now with equal weights) closes the loop and prepares
for the next simulation time step. A recent discussion
of such methods can be found in [35].
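The re-weighting and resampling loop can be sketched for a scalar state. This is a generic bootstrap particle filter update under assumed conditions (a direct observation of the state with Gaussian error, multinomial resampling); the dynamical propagation step between measurements is omitted, and `particle_filter_update` is a hypothetical name.

```python
import numpy as np

def particle_filter_update(particles, observation, obs_sigma, rng):
    """Re-weight particles by a Gaussian likelihood, then resample to equal weights."""
    log_w = -0.5 * ((observation - particles) / obs_sigma)**2
    w = np.exp(log_w - log_w.max())           # subtract the maximum for numerical stability
    w /= w.sum()
    idx = rng.choice(len(particles), size=len(particles), p=w)
    return particles[idx], w

rng = np.random.default_rng(3)
particles = rng.normal(0.0, 2.0, size=5000)   # prior ensemble of a scalar state
resampled, weights = particle_filter_update(
    particles, observation=1.5, obs_sigma=0.5, rng=rng)
```

For this linear Gaussian toy problem the resampled ensemble should scatter around the analytic posterior mean $1.5 \cdot 4/4.25 \approx 1.41$, which provides a simple sanity check.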

Ensemble Kalman filters likewise represent the
system knowledge as an ensemble of realizations that can
be propagated in time by the full non-linear dynamics.
The data assimilation step, however, is not done via
re-weighting or re-sampling, but by Kalman filtering.
Kalman filtering is basically Wiener filtering, which we
introduce in Sect. II A, while using an empirically
determined signal covariance matrix. This is computed
from the ensemble, which is informed by the actual
external measurement data.
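A minimal sketch of such an update is given below. It implements the common stochastic (perturbed observation) variant of the ensemble Kalman filter for a two-component state of which only the first component is observed; the variant, the toy covariance, and the name `enkf_update` are illustrative assumptions.

```python
import numpy as np

def enkf_update(ensemble, H, y, R_obs, rng):
    """Stochastic ensemble Kalman update; ensemble has shape (n_members, n_state)."""
    n_members = ensemble.shape[0]
    anomalies = ensemble - ensemble.mean(axis=0)
    P = anomalies.T @ anomalies / (n_members - 1)          # empirical state covariance
    K = P @ H.T @ np.linalg.inv(H @ P @ H.T + R_obs)       # Kalman (Wiener filter) gain
    perturbed = y + rng.multivariate_normal(np.zeros(len(y)), R_obs, size=n_members)
    return ensemble + (perturbed - ensemble @ H.T) @ K.T

rng = np.random.default_rng(7)
true_cov = np.array([[1.0, 0.8], [0.8, 1.0]])
ensemble = rng.multivariate_normal([0.0, 0.0], true_cov, size=2000)
H = np.array([[1.0, 0.0]])                                  # observe only the first component
y = np.array([1.0])
R_obs = np.array([[0.1]])
updated = enkf_update(ensemble, H, y, R_obs, rng)
```

The unobserved second component is shifted as well, because the empirical covariance carries the correlation between the two components into the gain.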

Variational methods for data assimilation combine
the action of a Lagrangian determining the dynamics and
a loss function describing a penalty for any mismatch of
the model prediction and the data [39]. From this
combined Lagrangian, combining dynamics and data
constraints, a variational equation arises that
satisfies both the system dynamics and the data
constraints. Variational methods treat information
processing and field dynamics simultaneously, similar to
IFD.

A third approach to data assimilation has recently
been proposed for the simulation of cosmic structure
formation [40-42]. There the full posterior of the
cosmic matter field as determined by galaxy catalogs and
the Gaussian initial condition statistics of cosmic
structure formation is sampled via a Hamiltonian
sampling method.

Appendix B: Maximum entropy principle 

The MEP [7-10] is uniquely specified by the
following three requirements on how probabilities
should be ranked and updated with respect to new
information. Entropy is defined to quantify how well a
given PDF represents a knowledge state. Its functional
form is determined by three requirements on the
resulting probability updating scheme:

• Locality: Local information has local effects; 
information that affects only some part of the 
phase space should not modify the entropy and 
the implied MEP PDF in case this area is dis- 
carded. 

• Coordinate invariance: The system of coor- 
dinates of the phase space does not carry in- 
formation. Entropy should be invariant under 
coordinate transformation as well as the deter- 
mined MEP PDF. 

• Independence: Independent systems can be 
treated jointly or separately, yielding the same 
entropy in both cases. The joint MEP PDF 
must therefore be separable into a product of 
PDFs for the individual systems. 
The unique (up to trivial rescaling) entropy functional
on PDFs that is consistent with these requirements is
given by (13), as was shown by [7-10]. The usual way to
use this entropy in order to specify the PDF
$\mathcal{P}(\phi)$ is to maximize it subject to some
constraints imposed on certain moments of the signal
field statistics. An obvious one is the proper
normalization $\langle 1 \rangle_{\mathcal{P}(\phi)} = 1$
of the PDF, but also a number of higher moments might
be known a priori, and summarized in the form
$\langle f_i(\phi) \rangle_{\mathcal{P}(\phi)}$. Here the
functions $f_i$ could be simple moments like $\phi$,
$\phi\phi^\dagger$, etc. or more complicated functions
thereof. These constraints on PDF moments are then
incorporated into the entropy via Lagrange multipliers
or thermodynamical potentials $\mu$ and
$\lambda = (\lambda_i)_i$:

$$S(\mathcal{P}, \mu, \lambda | \mathcal{Q}) = S(\mathcal{P}|\mathcal{Q}) - \langle \mu + \lambda^\dagger f(\phi) \rangle_{\mathcal{P}(\phi)} = -\int \mathcal{D}\phi\, \mathcal{P}(\phi) \left[ \log\frac{\mathcal{P}(\phi)}{\mathcal{Q}(\phi)} + \mu + \lambda^\dagger f(\phi) \right]. \tag{B1}$$

Maximizing this entropy with respect to all components
of $\mathcal{P}(\phi)$ yields

$$\mathcal{P}(\phi) = \frac{\mathcal{Q}(\phi)\, e^{-\lambda^\dagger f(\phi)}}{Z(\lambda)}, \tag{B2}$$

where

$$Z(\lambda) = \int \mathcal{D}\phi\, \mathcal{Q}(\phi)\, e^{-\lambda^\dagger f(\phi)} \tag{B3}$$

ensures proper normalization, and the Lagrange
potentials $\lambda$ have to be chosen to satisfy

$$-\partial_\lambda S = -\partial_\lambda \log Z = \int \mathcal{D}\phi\, \mathcal{P}(\phi)\, f(\phi) = \langle f(\phi) \rangle_{\mathcal{P}(\phi)}. \tag{B4}$$



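In Sect. III B it is claimed that the MEP distribution
for $\phi$ with known mean $\psi$ and covariance $\Phi$
is the Gaussian $\mathcal{G}(\phi - \psi, \Phi)$. This can
now be verified by a short calculation. The entropy (B1)
can be constrained by the knowledge of the zeroth,
first, and second moments of the field via the Lagrange
multiplier scalar $\mu$, field $\lambda$, and matrix
$\Lambda$, respectively:

$$S(\mathcal{P}, \mu, \lambda, \Lambda | \mathcal{Q}) = S(\mathcal{P}|\mathcal{Q}) - \mu\, \langle 1 \rangle_{(\phi)} - \lambda^\dagger \langle \phi \rangle_{(\phi)} - \mathrm{Tr}\left( \Lambda\, \langle \phi \phi^\dagger \rangle_{(\phi)} \right) \overset{\text{(B1)}}{=} -\int \mathcal{D}\phi\, \mathcal{P}(\phi) \left[ \log\frac{\mathcal{P}(\phi)}{\mathcal{Q}(\phi)} + \mu + \lambda^\dagger \phi + \phi^\dagger \Lambda\, \phi \right]. \tag{B5}$$

Maximizing this with respect to all components of
$\mathcal{P}(\phi)$ for a flat prior-prior
$\mathcal{Q}(\phi) = \mathrm{const}$ subject to the
constraints

$$-\partial_\mu S = \langle 1 \rangle_{(\phi)} = 1, \tag{B6}$$
$$-\partial_\lambda S = \langle \phi \rangle_{(\phi)} = \psi, \tag{B7}$$
$$-\partial_\Lambda S = \langle \phi \phi^\dagger \rangle_{(\phi)} = \Phi + \psi \psi^\dagger \tag{B8}$$

to ensure proper PDF normalization, mean, and variance,
respectively, yields
$\mathcal{P}(\phi|\psi, \Phi) = \mathcal{G}(\phi - \psi, \Phi)$
as assumed in (27).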
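The intermediate steps of this short calculation can be spelled out as follows. This is a sketch for a real-valued field and a flat $\mathcal{Q}$, starting from the constrained entropy with multiplier terms $\mu + \lambda^\dagger \phi + \phi^\dagger \Lambda \phi$ under the integral; the explicit values of the multipliers given below are our own intermediate result and are not quoted from the text.

```latex
\begin{align}
0 &= \frac{\delta S}{\delta \mathcal{P}(\phi)}
   = -\log\frac{\mathcal{P}(\phi)}{\mathcal{Q}} - 1 - \mu
     - \lambda^\dagger \phi - \phi^\dagger \Lambda\, \phi \\
\Rightarrow\quad
\mathcal{P}(\phi) &= \mathcal{Q}\, e^{-1-\mu}\,
   \exp\!\left( -\lambda^\dagger \phi - \phi^\dagger \Lambda\, \phi \right) \\
 &= \mathcal{Q}\, e^{-1-\mu}\, e^{\frac{1}{2} \psi^\dagger \Phi^{-1} \psi}\,
   \exp\!\left( -\tfrac{1}{2} (\phi - \psi)^\dagger \Phi^{-1} (\phi - \psi) \right)
   \quad\text{for}\quad
   \Lambda = \tfrac{1}{2} \Phi^{-1},\;\; \lambda = -\Phi^{-1} \psi .
\end{align}
```

With this choice of $\lambda$ and $\Lambda$ the distribution has mean $\psi$ and covariance $\Phi$, so the constraints on the first and second moments hold, and $\mu$ is fixed by the normalization constraint, which turns the prefactor into $|2\pi\Phi|^{-1/2}$.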